extra comments on getting started #171
Merged (1 commit, Nov 6, 2024)
docs/getting-started.lisp (34 additions, 13 deletions)
@@ -22,48 +22,62 @@
(defun present (&rest tensors)
"Present compiles and executes the given tensor, then prints the result."
(format t "~{~& =>~% ~A~}" (multiple-value-list (apply #'proceed tensors))))
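;; For example, this compiles, runs, and prints a 3x3 addition in one step
;; (a small usage sketch; !add is elementwise addition from caten/apis):
(present (!add (make-tensor `(3 3) :initial-element 1.0) (make-tensor `(3 3) :initial-element 2.0)))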
;;; ~~~[1. High Level Interface (caten/apis)]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
;;; The main role of `caten/apis` is to provide matrix operation APIs **with the same interface as Numpy/PyTorch**.
;;; Like Petalisp/tinygrad, Caten uses lazy evaluation.

;; For example, creating a 3x3 matrix initialized with 1.0 using make-tensor doesn't trigger any computation; this is what we call lazy evaluation.
(defparameter *tensor1* (make-tensor `(3 3) :initial-element 1.0))

(print *tensor1*) ;; Prints an unevaluated tensor: no computation has happened yet.

;; If you want to see the result, you have to compile the kernel using the function `proceed`.
(print (proceed *tensor1*))


;; Let's define another tensor for the next experiments.

(defparameter *tensor2* (make-tensor `(3 3) :initial-element 1.0))

;; To execute a previously compiled graph without recompiling, create an `AVM` using the `caten` function, then execute it with forward.
;; (that is: proceed = caten + forward)
;; The graph we build next consists of a single matrix multiplication taking two tensors as input.
;; The caten function compiles the lazy graph into an AVM: a compiled, low-level version of the matmul as an AST.
(print (caten (!matmul *tensor1* *tensor2*)))
;; proceed compiles and immediately executes the lowered AST (again: proceed = caten + forward).
(print (proceed (!matmul *tensor1* *tensor2*)))

;; Of course, Caten is designed so that all graphs can be compiled with dynamic shapes. There's no need to recompile every time the batch_size changes.
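;; A minimal sketch of dynamic shapes (assuming symbolic dimensions are passed as
;; symbols and bound at execution time; the exact binding convention may vary by version):
(let ((model (caten (!add (make-tensor `(a b)) (make-tensor `(a b))))))
  ;; The same compiled model can be run with any value of a and b, without recompiling:
  (print (forward model `(a . 10) `(b . 10))))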

;;; The goal of `caten/apis` is to prevent bugs by wrapping the low-level interface commands (described later) in a high-level API.
;;; You can use various optimizations by lowering the AST in `caten/apis` into `caten/air`!
;;; => In the next section, we will learn about `caten/air`.

;;; ~~~[Low Level Interface (caten/air)]~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
;;; In general, a deep learning compiler is a program that transforms and lowers operations expressed as a DAG (Directed Acyclic Graph).
;;; Caten implements a general-purpose DAG processing library called `caten/air`.
;;; Any data structure with a DAG structure in Caten's source code should be defined as a `caten/air:Graph`.

(print (make-node :BinaryOps :ADD (list 'a) (list 'b 'c) :reduction t))
(print (make-graph (make-node :BinaryOps :ADD (list 'a) (list 'b 'c))))
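;; A graph is simply a container of nodes; you can inspect them with graph-nodes
;; (the same accessor used later in this file):
(print (graph-nodes (make-graph (make-node :BinaryOps :ADD (list 'a) (list 'b 'c)))))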

;;; Any function whose name starts with `%` represents a low-level operation for air node creation.
;;; By wrapping such graph constructions with the with-context macro, Caten automatically organizes them into an executable graph.
;;; - caten/avm:%realize to run the graph in AVM
;;; - caten/aasm:->dot to open the graph in your browser


;; Example:
;; The following defines a low-level graph that performs an addition: two constants are created and added together.
;; (->dot requires Graphviz to be installed!)
(let ((graph
        (with-context
          (x (%fconst 1.0))
          (y (%fconst 2.0))
          (out (%add x y)))))
  (print graph)
  (print (%realize graph))
  ;; (->dot graph) ;; Uncomment to open the graph in your browser.
  )

;;; - Nodes can be defined using defnode.
;;; - Pattern Matcher can be defined using `defsimplifier`
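;; A hypothetical sketch of a rewrite rule (illustrative only: the pattern DSL
;; shown here is assumed, not verified; see the defsimplifier docstring for the
;; exact syntax). A rule folding `x + 0.0` into `x` would look roughly like:
;; (defsimplifier (remove-add-zero :speed 0)
;;   ((:ADD (x (:fconst (0.0)))) -> x))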
@@ -132,6 +146,7 @@
(apply-rewriting-rules avm))

(defparameter *width* 120)

(defun saying (number title object)
(dotimes (i *width*) (princ "="))
(fresh-line)
@@ -178,21 +193,26 @@
(vm (make-avm graph :axpy-demo nil (list 'z) nil)))
(declare (ignore _))
(fresh-line)
(saying 1 "Compiling the following initial computation graph:" graph)
(saying 1 "Compiling the following initial
computation graph:" graph)
(saying 2 "Created AVM (Abstract VM) with the computation graph:" vm)
;; caten/codegen requires the shape of all computation nodes to be known!
(run-shape-inference vm)
;; Now we're ready to run the scheduler: pass the AVM graph to `graph-schedule`, which partitions it into smaller units called `schedule-items`.
(let ((schedule-graph (graph-schedule (avm-graph vm)))
(*expr-cache* (make-expr-cache))
(renderer (make-instance 'CStyle-Renderer)))
(saying 3 "Generated schedule-graph with the computation graph" schedule-graph)
;; After the `schedule-items` are created, each item is lowered into a blueprint suitable for code generation.
(dolist (item (graph-nodes schedule-graph))
;; If the schedule-item is marked as jitable, it can be lowered:
(when (getattr item :jitable)
;; Each item is lowered using the lower-schedule-item function.
(lower-schedule-item item (avm-graph vm) schedule-graph)
(saying 4 "Lowered schedule item to a blueprint suitable for code generation:" (print-blueprint (getattr item :blueprint) nil))
(schedule-item-write-define-global item)
;; With the blueprint code, a C kernel is generated
(let ((c-kernel (%render-kernel renderer item)))
(saying 5 "Generated C code from the blueprint:" c-kernel)
(setf (getattr item :rendered-object) c-kernel))))
@@ -202,6 +222,7 @@
(setf (avm-graph vm) (schedule-graph->avm-graph (avm-graph vm) schedule-graph))
(avm-reset vm)
;; Try axpy! Finally, the generated C code is executed:
(saying
6 "Running the computation X(3x3) + Y(3x3), the result is:"
(%run vm (cons 'x (linspace `(3 3) 1 0)) (cons 'y (linspace `(3 3) 1 0))))))