[JIT] Kernel Creation, Improving memory-planner, etc... #32

hikettei · 2024-08-23T07:11:29Z

hikettei · 2024-08-24T04:53:08Z

The generated softmax kernel is getting more sophisticated. In addition, val_1 and val_11 could be turned into temporary variables (tmpvar), because their values do not need to persist across iterations of c0.
(Is it OK to use metadata such as :reduction for this?)

CATEN> (caten (!softmax (make-tensor `(3 3))))
WARNING: WIP: MaxOp
Compiled[e4]:
#include <math.h>
#include <stdint.h>
#define boolean _Bool
#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

/*
Arrays:
  - val_11[float32]: (3 1) // IO, TMP
  - val_0[float32]: (3 3) // IO, USER
  - val_1[float32]: (3 1) // IO, TMP
*/
void main9327315_e4_k0(float* val_11, float* val_0, float* val_1);
void main9327315_e4_k0(float* val_11, float* val_0, float* val_1) {
  for(int c0=0;(c0<=2);c0+=1) {
    val_1[c0+0] = 0.0;
    for(int c1=0;(c1<=2);c1+=1) {
      val_1[c0+0] = max(val_1[c0+0], val_0[3*c0+c1]);
    }
    val_11[c0+0] = 0.0;
    val_1[c0+0] = -(val_1[c0+0]);
    for(int c1=2;(c1<=4);c1+=1) {
      val_0[3*c0+(c1-2)] = exp2(((val_0[3*c0+(c1-2)]+val_1[c0+0])*1.442695));
      val_11[c0+0] = (val_11[c0+0]+val_0[3*c0+(c1-2)]);
    }
    val_11[c0+0] = 1/(val_11[c0+0]);
    for(int c1=4;(c1<=6);c1+=1) {
      val_0[3*c0+(c1-4)] = (val_0[3*c0+(c1-4)]*val_11[c0+0]);
    }
  }
}

hikettei · 2024-08-24T06:10:41Z

The point of this PR is to optimize temporary-buffer handling and to lay the groundwork for custom kernels, so let's forget about reduction.

hikettei · 2024-08-27T06:40:17Z

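(To be written up somewhere.)
Constraints on Tensor/Loop Bound

Shape: Symbolic/Tensor/Fixnum are OK
Stride: Symbolic/Fixnum are OK (auto-generated as a rule)
Permute: all Fixnum
View: only By must be a Fixnum (or a View that contains By is not scheduled)
In exchange, either allow code to be written directly in a Lisp-like DSL, or permit lexicographical memory accessing.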
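
As a rough illustration of how such constraints could surface in a rendered kernel, here is a hand-written sketch. It is not compiler output and makes several assumptions: the name `copy_strided_view`, the macro `VIEW_BY`, and the choice of a copy operation are all hypothetical. The extent `n` and the `stride` play the role of symbolic values passed at run time, while the view step is a fixed integer baked into the code.

```c
#include <assert.h>

/* Hypothetical fixed view step (the "By" of a view), known at compile time. */
#define VIEW_BY 2

/* Copies every VIEW_BY-th element of src into dst.
 * n (extent of the viewed axis) and stride are runtime parameters,
 * standing in for symbolic shape and stride values. */
void copy_strided_view(float *dst, const float *src, int n, int stride) {
  assert(n >= 0 && stride > 0);
  for (int c0 = 0; c0 < n; c0 += 1) {
    /* The loop bound depends on n; the step is the literal VIEW_BY. */
    dst[c0] = src[stride * (VIEW_BY * c0)];
  }
}
```

For example, with `src = {0, 1, 2, 3, 4, 5, 6, 7}`, `n = 4`, and `stride = 1`, the destination receives the elements at indices 0, 2, 4, and 6.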

hikettei · 2024-08-27T11:35:37Z

Merging this now that the SBCL checks have passed.

hikettei added 15 commits August 23, 2024 13:13

WIP: Group into multiple groups first

3856919

ngrouping mechanism

8c3bcc9

enhancement: allocate relocation

369c37f

symbolic compilation simplified

6976d6e

update: relocate view computations

610cbac

print-object expr

0abba6a

update

0a76cce

compiling scalar softmax

00b9a85

max(z)

9d65eab

WIP: Single Loop Single Kernel RUle

26dc167

Fix: Single Function Single Loop

f37264c

Tweaking

07005e1

infer broadcasts, fix naming

819549a

commit before refactor

09d888e

fix: renderer/prune unused args

4dbcb44

Refactor and eliminated unused vars

3ed9779

hikettei changed the title ~~Feature/tmpvar jit~~ [JIT] Kernel Creation, Improving memory-planner, etc... Aug 24, 2024

hikettei added 11 commits August 24, 2024 18:22

typooooooooooo T_T

c929012

Add: purge unused allocation, (needs to update fw-outputs)

5d36de3

no side effects are allowed

e81c100

Fix: buffers created by jit

97a83f8

Push the failed-to-in-place tensor

60ba411

forgot to nrev

e8329c0

Enhancement: Rendering-Graph Level Simplifying Process

00fe944

Enhancement: failed-to-inplace computation considers the time-series

3b47275

compiling rand

014c917

optimize the gcc-calling time

bafee30

Fix: relocate allocs most related place

b69deda

hikettei added 16 commits August 25, 2024 16:10

Fix

fe6ac82

fix in place

c49969c

remove duplicates

22fd98f

fix splitting the kernels

cb9a893

cast

fd9f1ca

fix: inlined t is not a shape

6c74b70

fix: UPPER_CASE symbol

10b8324

fix: uppercase

949e688

newid

6bb3e80

dont split w/ if

6662928

fix or

56ba6d9

fix: refcoutn <= 0

054ce36

Fix: double-float

6271ddf

Split memory-loading process

7efa404

Fix: split only scalar loading (due to index computation)

bac8ab8

more detail

a932379

hikettei added 7 commits August 27, 2024 17:03

Refactor: Introduced Permute Inferencer, ready to support full symbolic compilation

bec722d

Fix for symbolic compilation indexing deps

770b6c7

replace funcall

6645df9

pay attention for time-series dependencies

0e45dc4

tweaking on the scalar scheduling

e5e1393

fix for index-component dependencies

68bc864

patch for symbolic by

b176984

hikettei marked this pull request as ready for review August 27, 2024 11:11

hikettei added 3 commits August 27, 2024 20:15

no symbolic by

a9a9ca9

updated jit tests

81c1155

patch for ecl

eda9957

hikettei merged commit 069a40c into main Aug 27, 2024
3 checks passed
