[Refactor] FastGraph: 10x times faster Pattern Matcher #101

Merged
merged 33 commits into from
Sep 18, 2024

@hikettei hikettei commented Sep 17, 2024

  • Implement make-node: Check the type of attr slots #100
  • Optimize/Fix bugs in the pattern matcher
    • Compile GPT-2 Model in a reasonable time
    • Compile a Symbolic GPT-2 Model
    • GPT2: Fix a bug where required nodes were purged
    • Make all lowering work; initializer.lisp: call -> forward


hikettei commented Sep 17, 2024

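So what is actually slow?

  • Conclusion: the code is written to depend on (assume) the order of the nodes. If we assume that the let-binding we will probably implement in the future is immutable, the Graph can be treated as a DAG, which enables various optimizations. Refactoring so that Graph is used where the graph is not a DAG and FastGraph where it is should be enough (and then eliminate the method dispatch overhead with compiler-macro).
  • Data structure (an Array is probably a better fit than a List)
    • Nature of the data: the List given to the first make-graph call is the longest the node list ever gets
    • After that, the Simplifier deletes unneeded nodes, so the number of elements shrinks
    • However, lowering a Module increases the number of nodes, which makes this difficult
    • If the point above is solved, could we use a fixed-length Simple-Array?
  • Review the overall computational complexity
    • resolve-isolated-graph is slow -> make it zero-cost with FastGraph
  • Exploiting the reusability of Modules would probably allow a cache, but it is likely to introduce bugs, so it is not considered for now
  • Make Graph a CLOS class and introduce FastGraph. (nodes becomes an array; this requires the assumption that the number of elements never grows beyond the initial value)
    • According to the previous benchmark, the benefit of deleting unneeded nodes and thereby reducing N was the larger one
    • FastGraph is used as a data structure that builds a cache for the graph (nodes becomes a HashTable)
    • The data structure is handled poorly. (The Pattern Matcher is also slow because it re-allocates graph-nodes many times)
    • We want to be able to assume that the compiled result is uniquely determined even if the List is shuffled
    • Keep Graph; make FastGraph signal an error on operations that copy
    • Store the graph's vertices (nodes) in a HashTable
    • (with-fast-graph (bind graph) <no node copies are allowed>)
    • pattern matcher: look up each leaf with (gethash id xxx); think of it as pattern_match(id)
      • Work out the pattern_match(id) operation concretely:
      • Refactor: turn avm-forward-output-ids into PAUSE/BACKWARD
      • Prep Refactor: replace all code that does graph-nodes = ...
      • Run the same tests for both Graph and FastGraph
      • compiler-macro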
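
To make the hash-table idea above concrete, here is a minimal sketch in Common Lisp. It is not Caten's FastGraph: `toy-node`, `index-nodes`, and `live-ids` are hypothetical names invented for illustration, and the sketch assumes the graph is a DAG in which every id is written by exactly one node. It indexes the nodes by id once, then walks backwards from the output ids, so the result does not depend on the order of the input list and isolated nodes are simply never visited.

```lisp
;; Hypothetical toy types; these are NOT Caten's Node/Graph/FastGraph definitions.
(defstruct toy-node
  (id nil :type symbol)      ; the id this node writes
  (op :nop :type keyword)    ; e.g. :LOAD, :ADD
  (reads nil :type list))    ; ids this node reads

(defun index-nodes (nodes)
  "Build an id -> node table once, so later lookups avoid scanning a list."
  (let ((table (make-hash-table :test #'eq)))
    (dolist (node nodes table)
      (setf (gethash (toy-node-id node) table) node))))

(defun live-ids (table output-ids)
  "Return the ids reachable from OUTPUT-IDS, dependencies first."
  (let ((seen (make-hash-table :test #'eq))
        (order '()))
    (labels ((visit (id)
               (let ((node (gethash id table)))
                 ;; Skip unknown ids and ids that were already visited.
                 (when (and node (not (gethash id seen)))
                   (setf (gethash id seen) t)
                   (mapc #'visit (toy-node-reads node))
                   (push id order)))))
      (mapc #'visit output-ids))
    (nreverse order)))
```

With nodes `a` and `b` (loads), `c` (an add reading `a` and `b`), and an unused load `d`, calling `(live-ids (index-nodes nodes) '(c))` gives the same list whether `nodes` is passed in its original order or reversed, and `d` never appears in it.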

Writing out the compilation steps

  • Places to revisit: iseq.lisp, pattern-matcher.lisp
[Initial graph (a mix of un-lowered Modules and lowered IR)]
↓
[Simplifier over Modules + lowered IR]
↓
[Lowering]
↓
[Graph of lowered IR only (largest number of elements)]
↓
[Simplify+Constant Folding]
↓
[VM Graph complete]
↓
[Continues to JIT (it works on the post-Simplifier graph, so the cost is not large, but on reflection the heavy use of append and the like should be addressed)]
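
On the remark about heavy use of append: the sketch below shows the general issue in plain Common Lisp. It is not taken from Caten's sources, and both function names are hypothetical. Appending to the tail of a list inside a loop copies the accumulated list on every iteration, while `push` followed by a single `nreverse` builds the same list in linear time.

```lisp
;; Hypothetical helpers for illustration only; not part of Caten.
(defun collect-with-append (items)
  "Quadratic: every APPEND copies the whole list accumulated so far."
  (let ((acc '()))
    (dolist (item items acc)
      (setf acc (append acc (list item))))))

(defun collect-with-push (items)
  "Linear: PUSH is constant time and one NREVERSE restores the order."
  (let ((acc '()))
    (dolist (item items (nreverse acc))
      (push item acc))))
```

Both functions return a fresh list with the elements in their original order; only the amount of copying differs.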

Rethinking the Pattern Matcher



hikettei commented Sep 17, 2024

Switching to FastGraph is more than 10x faster than using Graph (0.000122 s vs. 0.001721 s of total run time in the measurements below)!

CATEN> (defparameter *sum-base-graph*
	         (let ((*external-simplifiers* nil))
	         (avm-graph(caten(!mean (!mean (!mean (!mean (!mean (make-tensor `(10 10 10)))))))))))

CATEN> (progn (time (caten/aasm::%0_fuse_load_alloc (->fast-graph *sum-base-graph*))) nil)
Evaluation took:
  0.000 seconds of real time
  0.000122 seconds of total run time (0.000113 user, 0.000009 system)
  100.00% CPU
  131,056 bytes consed
  
NIL
CATEN> (progn (time (caten/aasm::%0_fuse_load_alloc *sum-base-graph*)) nil)
Evaluation took:
  0.001 seconds of real time
  0.001721 seconds of total run time (0.001569 user, 0.000152 system)
  200.00% CPU
  196,560 bytes consed
  
NIL
CATEN> 


hikettei commented Sep 17, 2024

Workload

  • Refactor: 1. Guarantee that the graph structure is uniquely determined even if the list is shuffled. (Make PAUSE/BACKWARD the terminal node of the graph.)
  • Refactor: 2. compiler-macro and inlining methods
  • Refactor: 3. Replace every (setf (graph-nodes graph) ...) with the API in ./caten/air/graph.lisp, extending that API where necessary.
  • Refactor: 4. Rewrite the heavy parts from Graph to FastGraph and benchmark them
  • Export_to_c/export_to_dot
  • SAFETY=3: do not use FastGraph
  • Goal: nearly 0 bytes consed
  • Optimize: verify-graph ((Graph Graph))
  • Fix for backward
  • Optimize: try-fold-constant
  • Testing GPT2 Compilation (Symbolic, Speed, Validity)
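
As a rough illustration of the "compiler-macro and inlining methods" item, the sketch below uses standard `define-compiler-macro` to bypass generic dispatch when the call site states the argument type with `the`. It is an assumption about the general technique, not the code in this PR: `toy-graph`, `nodes-of`, and the rewrite rule are all hypothetical.

```lisp
;; Hypothetical toy structure and generic function; not Caten's Graph API.
(defstruct (toy-graph (:constructor make-toy-graph (nodes)))
  (nodes nil :type list))

(defgeneric nodes-of (graph)
  (:documentation "Generic entry point; dispatches on the class of GRAPH."))

(defmethod nodes-of ((graph toy-graph))
  (toy-graph-nodes graph))

;; When the argument is written as (the toy-graph ...), rewrite the call to
;; the structure accessor so no generic dispatch is needed at run time.
;; Any other call shape is returned unchanged and dispatches as usual.
(define-compiler-macro nodes-of (&whole form graph)
  (if (and (consp graph)
           (eq (first graph) 'the)
           (eq (second graph) 'toy-graph))
      `(toy-graph-nodes ,graph)
      form))
```

A call written as `(nodes-of (the toy-graph g))` may then be compiled as `(toy-graph-nodes (the toy-graph g))`, while `(nodes-of g)` still goes through the generic function. An implementation is allowed to ignore compiler macros, so the rewrite has to stay semantically equivalent to the method it replaces.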

@hikettei hikettei changed the title [Enhancement] Revise the whole algorithm of pattern matcher [Refactor] Revise the whole algorithm of pattern matcher Sep 17, 2024
@hikettei hikettei changed the title [Refactor] Revise the whole algorithm of pattern matcher [Refactor] 10x times faster Pattern Matcher Sep 17, 2024
@hikettei hikettei changed the title [Refactor] 10x times faster Pattern Matcher [Refactor] FastGraph: 10x times faster Pattern Matcher Sep 17, 2024
hikettei commented

  • (Another PR) Optimize JIT

hikettei commented

20.4x improvement in randn

;; BASELINE
CATEN> (ctx:with-contextvar (:safety 1)
	 (progn (time (caten (!randn `(a b)))) nil))

Evaluation took:
  10.359 seconds of real time
  10.354411 seconds of total run time (10.284705 user, 0.069706 system)
  99.95% CPU
  291,400,592 bytes consed
  
NIL
;; THIS PR
CATEN> (ctx:with-contextvar (:safety 0)
	 (progn (time (caten (!randn `(a b)))) nil))
Evaluation took:
  0.506 seconds of real time
  0.506286 seconds of total run time (0.494233 user, 0.012053 system)
  100.00% CPU
  71,878,992 bytes consed
  
NIL
CATEN> 


hikettei commented Sep 18, 2024

  • inlining CLOS methods by compiler-macro
  • Optimize: defsimplifier
  • Optimize: fold-constant should use FastGraph
  • Optimize: verify-graph for Graph
  • Goal: nearly 0 bytes consed
  • Testing GPT2
  • inline make-attr (profile: seconds 0.000 | gc 0.000 | consed 0 | calls 3 | sec/call 0.000041 | CATEN/AIR::MAKE-ATTR)

hikettei commented

;; running 10 times
(ctx:with-contextvar (:safety 0)
	 (progn (time (caten (!randn `(a b)))) nil))
  seconds  |     gc     |     consed    |   calls   |  sec/call  |  name  
---------------------------------------------------------------
     1.734 |      0.000 |    35,317,584 |     9,550 |   0.000182 | CATEN/AIR::RESOLVE-ISOLATED-NODES
     0.063 |      0.000 |    20,118,208 |       185 |   0.000341 | CATEN/AIR:->GRAPH
     0.022 |      0.000 |     2,227,712 |     4,280 |   0.000005 | CATEN/APIS::%TPSORT-TENSORS
     0.015 |      0.000 |    53,181,888 |   173,690 |   0.000000 | CATEN/AIR::%MAKE-NODE-INLINED
     0.006 |      0.000 |     6,094,624 |        20 |   0.000294 | CATEN/AASM::FUSE-DUPLICATED-STORE
     0.006 |      0.000 |    41,012,784 |    36,685 |   0.000000 | CATEN/AIR:INSERT-NODES
     0.003 |      0.000 |     4,897,856 |    40,680 |   0.000000 | CATEN/APIS::SESSION/ASSIGN
     0.002 |      0.000 |    37,088,688 |   284,495 |   0.000000 | (SETF CATEN/AIR:GRAPH-NODES)
     0.002 |      0.000 |     1,572,592 |    40,245 |   0.000000 | CATEN/AIR::%GETATTR
     0.001 |      0.000 |     3,407,120 |    16,625 |   0.000000 | CATEN/APIS:MAKE-TENSOR
     0.001 |      0.000 |       262,080 |        15 |   0.000066 | CATEN/AIR:VERIFY-ARGS
     0.001 |      0.000 |     2,618,688 |     4,065 |   0.000000 | CATEN/APIS::MAKE-COMPILER-SESSION
     0.001 |      0.000 |       720,880 |     5,965 |   0.000000 | CATEN/APIS::SESSION/READGRAD
     0.001 |      0.000 |     7,404,576 |    40,245 |   0.000000 | CATEN/AIR:GETATTR
     0.000 |      0.000 |     1,638,128 |     4,070 |   0.000000 | CATEN/APIS::SESSION/UPDATE-OUTPUTS
     0.000 |      0.000 |       196,560 |        15 |   0.000017 | CATEN/AIR::ATTRIBUTE->INSTANCE
     0.000 |      0.000 |       327,600 |     9,785 |   0.000000 | (SETF CATEN/APIS:FUNC-VARIABLES)
     0.000 |      0.000 |     2,685,232 |    17,060 |   0.000000 | CATEN/APIS::%INTERNAL-MAKE-TENSOR
     0.000 |      0.000 |             0 |        15 |   0.000013 | CATEN/AIR::MAKE-ATTR
     0.000 |      0.000 |       654,880 |     1,450 |   0.000000 | CATEN/AIR::SPECIAL-P
     0.000 |      0.000 |             0 |         5 |   0.000031 | CATEN/APIS::SESSION/SYNC-MULTI-GRADS
     0.000 |      0.000 |       327,680 |     5,160 |   0.000000 | CATEN/AASM::NODE->ID1
     0.000 |      0.000 |             0 |       210 |   0.000000 | CATEN/APIS::SESSION/SETGRAD
     0.000 |      0.000 |             0 |        15 |   0.000003 | CATEN/APIS::SYMB
     0.000 |      0.000 |             0 |       435 |   0.000000 | CATEN/APIS:NDIM
     0.000 |      0.000 |             0 |       505 |   0.000000 | CATEN/APIS:SHAPE
     0.000 |      0.000 |             0 |        15 |   0.000001 | CATEN/AIR::%MAKE-NODE
     0.000 |      0.000 |             0 |        15 |   0.000001 | (SETF CATEN/APIS::MODULE-LOWER-OUTPUTS)
     0.000 |      0.000 |             0 |       635 |   0.000000 | CATEN/APIS::VIEWRANGE-BY
     0.000 |      0.000 |             0 |       635 |   0.000000 | CATEN/APIS::VIEWRANGE-FROM
     0.000 |      0.000 |             0 |       435 |   0.000000 | CATEN/APIS::VIEW-NRANK
     0.000 |      0.000 |             0 |         5 |   0.000002 | CATEN/APIS::CAST-DTYPE-FRM
     0.000 |      0.000 |             0 |        40 |   0.000000 | CATEN/APIS::SESSION/SET-MULTI-GRAD
     0.000 |      0.000 |             0 |        25 |   0.000000 | CATEN/APIS::MOVE-REDUCTION
     0.000 |      0.000 |        65,488 |        40 |   0.000000 | CATEN/APIS::ALLOC-ID
     0.000 |      0.000 |             0 |        15 |   0.000001 | (SETF CATEN/APIS:MODULE-OUTPUTS)
     0.000 |      0.000 |             0 |       635 |   0.000000 | CATEN/APIS::VIEWRANGE-TO
     0.000 |      0.000 |             0 |        35 |   0.000000 | CATEN/APIS::MODULE-IMPL-ISEQ
     0.000 |      0.000 |             0 |       635 |   0.000000 | CATEN/APIS::VIEWRANGE-SIZE
     0.000 |      0.000 |             0 |        15 |   0.000000 | CATEN/AIR::VERIFY-ATTRS
     0.000 |      0.000 |             0 |        10 |   0.000001 | CATEN/APIS::COSNODE
     0.000 |      0.000 |       196,608 |    10,340 |   0.000000 | CATEN/APIS::FUNC-REDUCE
     0.000 |      0.000 |             0 |        10 |   0.000001 | CATEN/AASM:%SIN
     0.000 |      0.000 |             0 |        10 |   0.000001 | CATEN/AASM:%SQRT
     0.000 |      0.000 |             0 |         5 |   0.000001 | CATEN/AASM:%LOG2
     0.000 |      0.000 |             0 |         5 |   0.000001 | CATEN/APIS::THREEFRY2X32-RANDOM
     0.000 |      0.000 |             0 |       245 |   0.000000 | CATEN/APIS::CAST-DTYPE-TO
     0.000 |      0.000 |             0 |        40 |   0.000000 | CATEN/APIS::MODULE-LOWER-OUTPUTS
     0.000 |      0.000 |             0 |        30 |   0.000000 | CATEN/AIR::VERIFY-BUFFERS
     0.000 |      0.000 |             0 |         5 |   0.000001 | CATEN/APIS:!RAND
     0.000 |      0.000 |             0 |        15 |   0.000000 | (SETF CATEN/APIS::MODULE-IMPL-ISEQ)
     0.000 |      0.000 |             0 |       435 |   0.000000 | CATEN/APIS::VIEW-VIEWS
     0.000 |      0.000 |             0 |        10 |   0.000000 | CATEN/APIS:ORDER
     0.000 |      0.000 |             0 |        15 |   0.000000 | CATEN/APIS:MODULE-ATTRS
     0.000 |      0.000 |             0 |       635 |   0.000000 | CATEN/APIS::VIEWRANGE-BROADCAST
     0.000 |      0.000 |             0 |         5 |   0.000000 | CATEN/APIS::SESSION/SET-TID
     0.000 |      0.000 |       327,680 |    21,035 |   0.000000 | CATEN/APIS::ALLOC-FROM
     0.000 |      0.000 |       131,072 |     2,715 |   0.000000 | CATEN/APIS::ADD-WRAP-AROUND
     0.000 |      0.000 |        65,536 |       365 |   0.000000 | CATEN/APIS::RESHAPE-ORDER
     0.000 |      0.000 |     1,376,000 |        15 |   0.000000 | CATEN/APIS::%MAKE-GRAPH-BACKWARD
     0.000 |      0.000 |        65,536 |     2,850 |   0.000000 | CATEN/APIS::->CONST
     0.000 |      0.000 |             0 |        40 |   0.000000 | CATEN/APIS::CLONE-LIKE
     0.000 |      0.000 |             0 |     2,715 |   0.000000 | CATEN/APIS::FUNC-ID
     0.000 |      0.000 |             0 |        10 |   0.000000 | CATEN/APIS::%MODULE->ISEQBW
     0.000 |      0.000 |       393,152 |     2,850 |   0.000000 | CATEN/APIS::->ICONST
     0.000 |      0.000 |        65,536 |     1,925 |   0.000000 | CATEN/APIS::RESHAPE-SHAPE-AF
     0.000 |      0.000 |       589,760 |    26,815 |   0.000000 | CATEN/APIS::AT-NAME
     0.000 |      0.000 |             0 |        10 |   0.000000 | CATEN/APIS::%MODULE->ISEQFW
     0.000 |      0.000 |     3,276,240 |    46,500 |   0.000000 | CATEN/APIS::MAKE-AT
     0.000 |      0.000 |       524,176 |     6,740 |   0.000000 | CATEN/APIS::MUL-WRAP-AROUND
     0.000 |      0.000 |             0 |       110 |   0.000000 | CATEN/APIS::!IDIV1
     0.000 |      0.000 |       458,640 |        15 |   0.000000 | CATEN/APIS::%MODULE-OBJ->ISEQFW
     0.000 |      0.000 |    57,337,008 |     4,085 |   0.000000 | CATEN/APIS::%LOWER-ISEQ
     0.000 |      0.000 |       720,640 |     1,265 |   0.000000 | CATEN/APIS::MAKE-SCALAR
     0.000 |      0.000 |        65,504 |         5 |   0.000000 | CATEN/APIS::!THREEFRY2X32
     0.000 |      0.000 |        65,520 |       435 |   0.000000 | CATEN/APIS::MERGE-VIEWS
     0.000 |      0.000 |    15,262,272 |    14,025 |   0.000000 | CATEN/APIS::%SOLVE-ST
     0.000 |      0.000 |       262,080 |        70 |   0.000000 | CATEN/APIS::.COMPOSE-VIEWS
     0.000 |      0.000 |     2,555,632 |    82,045 |   0.000000 | CATEN/APIS::SESSION/READ
     0.000 |      0.000 |       262,080 |       635 |   0.000000 | CATEN/APIS::PARSE-VIEW-SUBSCRIPT
     0.000 |      0.000 |       131,008 |       705 |   0.000000 | CATEN/APIS::MAKE-VRANGE
     0.000 |      0.000 |       196,560 |     1,270 |   0.000000 | CATEN/APIS::VRANGE-SIZE
     0.000 |      0.000 |       982,880 |         5 |   0.000000 | CATEN/APIS::%MAKE-GRAPH-FROM-ISEQ
     0.000 |      0.000 |     5,635,216 |     2,635 |   0.000000 | CATEN/APIS::%OBTAIN-FOLD-CONSTANT-RESULT
     0.000 |      0.000 |             0 |        10 |   0.000000 | CATEN/APIS::!SHR
     0.000 |      0.000 |       327,600 |     3,175 |   0.000000 | CATEN/APIS::SFOLD
     0.000 |      0.000 |             0 |     1,175 |   0.000000 | CATEN/APIS::RESHAPE-SHAPE-BF
     0.000 |      0.000 |     2,096,912 |    40,045 |   0.000000 | CATEN/APIS::ALLOC-INITIAL-ELEMENT
     0.000 |      0.000 |     2,227,904 |     4,305 |   0.000000 | CATEN/APIS::BROADCAST-ELWISE
     0.000 |      0.000 |     1,179,408 |     5,415 |   0.000000 | CATEN/APIS::%BROADCAST-AUTO
     0.000 |      0.000 |       917,376 |     4,060 |   0.000000 | CATEN/APIS::%TENSOR->AASM
     0.000 |      0.000 |       196,576 |    21,115 |   0.000000 | CATEN/APIS::ALLOC-BUFFER
     0.000 |      0.000 |       589,776 |    14,025 |   0.000000 | CATEN/APIS::MAKE-ST
     0.000 |      0.000 |     1,965,824 |     3,810 |   0.000000 | CATEN/APIS::TRY-FOLD-CONSTANT
     0.000 |      0.000 |             0 |        10 |   0.000000 | CATEN/APIS::%LOWER-MODULES
     0.000 |      0.000 |        65,536 |       395 |   0.000000 | CATEN/APIS:!RESHAPE
     0.000 |      0.000 |       458,688 |     1,360 |   0.000000 | CATEN/APIS:!RECIP
     0.000 |      0.000 |        65,520 |        70 |   0.000000 | CATEN/APIS:!>
     0.000 |      0.000 |        65,472 |       165 |   0.000000 | CATEN/APIS:UCONST
     0.000 |      0.000 |       131,056 |       255 |   0.000000 | CATEN/APIS:!CAST
     0.000 |      0.000 |             0 |        10 |   0.000000 | CATEN/APIS:!SQRT
     0.000 |      0.000 |             0 |        25 |   0.000000 | CATEN/APIS:!MOVE
     0.000 |      0.000 |             0 |         5 |   0.000000 | CATEN/APIS:!INDEX-COMPONENTS
     0.000 |      0.000 |             0 |        70 |   0.000000 | CATEN/APIS:!LCM
     0.000 |      0.000 |        65,520 |       210 |   0.000000 | CATEN/APIS:!EQ
     0.000 |      0.000 |             0 |        10 |   0.000000 | CATEN/APIS:!VIEW-FROM-BASE
     0.000 |      0.000 |             0 |     1,345 |   0.000000 | CATEN/APIS:!DIV
     0.000 |      0.000 |             0 |       525 |   0.000000 | CATEN/APIS:!CONTIGUOUS
     0.000 |      0.000 |             0 |       290 |   0.000000 | CATEN/APIS:!*
     0.000 |      0.000 |     2,817,712 |    82,685 |   0.000000 | CATEN/APIS:FUNC-VARIABLES
     0.000 |      0.000 |       131,056 |       345 |   0.000000 | CATEN/APIS:!UPRANK
     0.000 |      0.000 |     2,096,416 |     5,270 |   0.000000 | CATEN/APIS:ICONST
     0.000 |      0.000 |             0 |        10 |   0.000000 | CATEN/APIS:!OR
     0.000 |      0.000 |             0 |         5 |   0.000000 | CATEN/APIS:%COMPILE-TOPLEVEL
     0.000 |      0.000 |       131,040 |       425 |   0.000000 | CATEN/APIS:!VIEW
     0.000 |      0.000 |             0 |       205 |   0.000000 | CATEN/APIS:BACKWARD
     0.000 |      0.000 |       327,680 |    10,175 |   0.000000 | CATEN/APIS:TENSOR-P
     0.000 |      0.000 |       327,296 |       140 |   0.000000 | CATEN/APIS:FCONST
     0.000 |      0.000 |             0 |       105 |   0.000000 | CATEN/APIS:!XOR
     0.000 |      0.000 |     1,310,448 |       490 |   0.000000 | CATEN/APIS:!WHERE
     0.000 |      0.000 |       131,040 |        15 |   0.000000 | CATEN/APIS:IMPL
     0.000 |      0.000 |     8,321,792 |     8,955 |   0.000000 | CATEN/APIS:FORWARD
     0.000 |      0.000 |       393,152 |     2,240 |   0.000000 | CATEN/APIS:!MUL
     0.000 |      0.000 |             0 |        70 |   0.000000 | CATEN/APIS:!GCD
     0.000 |      0.000 |             0 |       140 |   0.000000 | CATEN/APIS:!+
     0.000 |      0.000 |             0 |         5 |   0.000000 | CATEN/APIS:!IDIV
     0.000 |      0.000 |             0 |        40 |   0.000000 | CATEN/APIS:DTYPE-OF
     0.000 |      0.000 |             0 |       210 |   0.000000 | CATEN/APIS:!>=
     0.000 |      0.000 |        65,520 |     1,495 |   0.000000 | CATEN/APIS:!NEG
     0.000 |      0.000 |             0 |        10 |   0.000000 | CATEN/APIS:!SIN
     0.000 |      0.000 |             0 |         5 |   0.000000 | CATEN/APIS:CATEN
     0.000 |      0.000 |       982,912 |     1,825 |   0.000000 | CATEN/APIS:!ADD
     0.000 |      0.000 |             0 |         5 |   0.000000 | CATEN/APIS:!RANDN
     0.000 |      0.000 |        65,536 |     1,275 |   0.000000 | CATEN/APIS:!SUB
     0.000 |      0.000 |             0 |        10 |   0.000000 | CATEN/APIS:CALL
     0.000 |      0.000 |             0 |        10 |   0.000000 | CATEN/APIS:!COS
     0.000 |      0.000 |             0 |         5 |   0.000000 | CATEN/APIS:!LOG
     0.000 |      0.000 |             0 |        15 |   0.000000 | CATEN/APIS:!AND
     0.000 |      0.000 |             0 |         5 |   0.000000 | CATEN/APIS:!CONST
     0.000 |      0.000 |             0 |       605 |   0.000000 | CATEN/APIS:TENSOR-ID
     0.000 |      0.000 |             0 |       210 |   0.000000 | CATEN/APIS:!SIGNUM
     0.000 |      0.000 |             0 |       140 |   0.000000 | CATEN/APIS:!MAXIMUM
     0.000 |      0.000 |             0 |        60 |   0.000000 | CATEN/APIS:MODULE-OUTPUTS
     0.000 |      0.000 |       655,088 |       435 |   0.000000 | CATEN/APIS:MAKE-VIEW-INTERNAL
     0.000 |      0.000 |             0 |        70 |   0.000000 | CATEN/APIS:!MINIMUM
     0.000 |      0.000 |    46,327,856 | 1,342,430 |   0.000000 | CATEN/AIR::%GRAPH-NODES
     0.000 |      0.000 |    10,812,080 |   284,495 |   0.000000 | (SETF CATEN/AIR::%GRAPH-NODES)
     0.000 |      0.000 |    16,775,200 |     9,550 |   0.000000 | CATEN/AIR::PURGE-ISOLATED-GRAPH
     0.000 |      0.000 |     6,552,752 |    40,265 |   0.000000 | CATEN/AIR:ID->USERS
     0.000 |      0.000 |     3,734,992 |   122,860 |   0.000000 | CATEN/AIR:NODE->ID
     0.000 |      0.000 |    69,594,544 |   556,300 |   0.000000 | CATEN/AIR:ID->VALUE
     0.000 |      0.000 |    14,284,688 |    87,175 |   0.000000 | CATEN/AIR:MAKE-GRAPH
     0.000 |      0.000 |   187,342,272 | 1,343,270 |   0.000000 | CATEN/AIR:GRAPH-NODES
     0.000 |      0.000 |     7,863,264 |   228,860 |   0.000000 | CATEN/AIR:NODE-TYPE
     0.000 |      0.000 |    39,577,808 |    40,680 |   0.000000 | CATEN/AIR:LOWER
     0.000 |      0.000 |       327,600 |       175 |   0.000000 | CATEN/AIR:->FAST-GRAPH
     0.000 |      0.000 |    30,537,424 |   937,795 |   0.000000 | CATEN/AIR:%GRAPH-NODES-TABLE
     0.000 |      0.000 |             0 |       165 |   0.000000 | (SETF CATEN/AIR:%GRAPH-NODES-TABLE)
     0.000 |      0.000 |    27,060,800 |    35,235 |   0.000000 | CATEN/AIR:REMNODE
     0.000 |      0.000 |       327,616 |     9,910 |   0.000000 | CATEN/AIR:GRAPH-SEEN
     0.000 |      0.000 |     1,769,232 |    76,325 |   0.000000 | CATEN/AIR:NODE-P
     0.000 |      0.000 |    17,758,480 |   540,000 |   0.000000 | CATEN/AIR:GRAPH-OUTPUTS
     0.000 |      0.000 |             0 |     8,140 |   0.000000 | (SETF CATEN/AIR:GRAPH-OUTPUTS)
     0.000 |      0.000 |             0 |        15 |   0.000000 | CATEN/AIR:MAKE-NODE
     0.000 |      0.000 |     6,356,000 |   177,380 |   0.000000 | CATEN/AIR:NODE-WRITES
     0.000 |      0.000 |   132,957,440 |    80,530 |   0.000000 | CATEN/AIR:ID->NODE
     0.000 |      0.000 |     1,244,944 |    40,680 |   0.000000 | CATEN/AIR:GRAPH-P
     0.000 |      0.000 |    39,841,648 | 1,272,750 |   0.000000 | CATEN/AIR:NODE-ID
     0.000 |      0.000 |    13,366,432 |     9,715 |   0.000000 | CATEN/AIR:VERIFY-GRAPH
     0.000 |      0.000 |             0 |     8,040 |   0.000000 | CATEN/AIR:NODE-CLASS
     0.000 |      0.000 |        65,536 |       600 |   0.000000 | CATEN/AASM::REINITIALIZE-TENSOR
     0.000 |      0.000 |    64,677,216 |     5,580 |   0.000000 | CATEN/AASM::%2_UNFOLD_LOAD_ALLOC
     0.000 |      0.000 |     3,526,096 |        20 |   0.000000 | CATEN/AASM::SIMPLIFY-DYNAMIC-ARITHMETIC
     0.000 |      0.000 |       262,096 |     1,475 |   0.000000 | CATEN/AASM::%ROW-MAJOR-CALC-STRIDES
     0.000 |      0.000 |       196,576 |     3,455 |   0.000000 | CATEN/AASM::SIZE-P
     0.000 |      0.000 |    80,990,240 |     5,580 |   0.000000 | CATEN/AASM::%0_FUSE_LOAD_ALLOC
     0.000 |      0.000 |    87,021,808 |     5,580 |   0.000000 | CATEN/AASM::%1_FOLD_CONSTANT
     0.000 |      0.000 |             0 |       245 |   0.000000 | CATEN/AASM:%CAST
     0.000 |      0.000 |     1,506,816 |     1,030 |   0.000000 | CATEN/AASM:%<
     0.000 |      0.000 |     1,310,096 |     1,930 |   0.000000 | CATEN/AASM:%RECIP
     0.000 |      0.000 |       196,544 |       260 |   0.000000 | CATEN/AASM:%GCD
     0.000 |      0.000 |    35,903,728 |    43,425 |   0.000000 | CATEN/AASM:%LOAD
     0.000 |      0.000 |       458,720 |       365 |   0.000000 | CATEN/AASM:%RESHAPE
     0.000 |      0.000 |       458,704 |       370 |   0.000000 | CATEN/AASM:%MOVE
     0.000 |      0.000 |       131,008 |       600 |   0.000000 | CATEN/AASM:INFER-TENSOR-INFO
     0.000 |      0.000 |       589,792 |       670 |   0.000000 | CATEN/AASM:%ALLOC
     0.000 |      0.000 |       524,224 |     1,030 |   0.000000 | CATEN/AASM:%<=
     0.000 |      0.000 |    26,268,784 |    45,130 |   0.000000 | CATEN/AASM:%SALLOC
     0.000 |      0.000 |             0 |        20 |   0.000000 | CATEN/AASM:OPTIMIZE-AASM
     0.000 |      0.000 |       131,072 |       680 |   0.000000 | CATEN/AASM:%SHAPE
     0.000 |      0.000 |     2,293,248 |     1,750 |   0.000000 | CATEN/AASM:%WHERE
     0.000 |      0.000 |    55,040,352 |     5,580 |   0.000000 | CATEN/AASM:FOLD-CONSTANT
     0.000 |      0.000 |     2,358,432 |     3,500 |   0.000000 | CATEN/AASM:%NOT
     0.000 |      0.000 |       262,144 |       720 |   0.000000 | CATEN/AASM:%>=
     0.000 |      0.000 |             0 |     1,475 |   0.000000 | CATEN/AASM:%STRIDE
     0.000 |      0.000 |        65,520 |       105 |   0.000000 | CATEN/AASM:%XOR
     0.000 |      0.000 |        65,536 |     2,470 |   0.000000 | CATEN/AASM:%=
     0.000 |      0.000 |     3,997,232 |    21,385 |   0.000000 | CATEN/AASM:%MAKE-TENSOR
     0.000 |      0.000 |       982,976 |       435 |   0.000000 | CATEN/AASM:%VIEW
     0.000 |      0.000 |       589,776 |     2,480 |   0.000000 | CATEN/AASM:%ICONST
     0.000 |      0.000 |     1,506,912 |     2,715 |   0.000000 | CATEN/AASM:%ADD
     0.000 |      0.000 |             0 |         5 |   0.000000 | CATEN/AASM:%IDIV
     0.000 |      0.000 |     1,572,560 |     1,760 |   0.000000 | CATEN/AASM:%OR
     0.000 |      0.000 |       458,656 |     1,030 |   0.000000 | CATEN/AASM:%>
     0.000 |      0.000 |     2,292,960 |     2,595 |   0.000000 | CATEN/AASM:%NEG
     0.000 |      0.000 |     1,834,560 |     2,470 |   0.000000 | CATEN/AASM:%!=
     0.000 |      0.000 |        65,536 |       620 |   0.000000 | CATEN/AASM:%MAX
     0.000 |      0.000 |             0 |         5 |   0.000000 | CATEN/AASM:%FCONST
     0.000 |      0.000 |             0 |         5 |   0.000000 | CATEN/AASM:%INDEX-COMPONENTS
     0.000 |      0.000 |             0 |        15 |   0.000000 | CATEN/AASM:%AND
     0.000 |      0.000 |     5,503,136 |     7,135 |   0.000000 | CATEN/AASM:%MUL
---------------------------------------------------------------
     1.859 |      0.000 | 1,398,398,704 | 8,580,985 |            | Total

@hikettei hikettei mentioned this pull request Sep 18, 2024
15 tasks
@hikettei hikettei marked this pull request as ready for review September 18, 2024 08:30
@hikettei hikettei merged commit e6aea22 into main Sep 18, 2024
2 checks passed