
Refactor: MultiHeadAttention #226

Merged: 8 commits merged into main from multi-head-attention on Nov 15, 2024
Conversation

@hikettei (Owner) commented Nov 15, 2024

TODO

  • MultiHeadAttention: compare the result against PyTorch's (see the sketch below)
  • Stable for larger inputs
  • At least no segfault?
    • The schedule cache is not well tested
    • Memory planner?
    • clang: possibly due to the restrict option?
  • No indexing with JIT=0
  • (maybe) the memory planner needs fixing

One of the issues is at the aasm stage; once that is fixed, everything should work.
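
A minimal sketch of the parity check the first TODO item describes, using plain PyTorch tensors. The hand-rolled side below is only a stand-in for the library's own MultiHeadAttention output, which is not shown in this PR:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
batch, heads, seq, dim = 2, 4, 16, 32
q = torch.randn(batch, heads, seq, dim)
k = torch.randn(batch, heads, seq, dim)
v = torch.randn(batch, heads, seq, dim)

# Reference: PyTorch's fused scaled dot-product attention.
expected = F.scaled_dot_product_attention(q, k, v)

# Candidate: explicit softmax(QK^T / sqrt(d)) V. In this PR's setting,
# the refactored MultiHeadAttention's output would go here instead.
scores = q @ k.transpose(-2, -1) / dim ** 0.5
actual = scores.softmax(dim=-1) @ v

# Elementwise tolerance check.
print("max abs error:", (actual - expected).abs().max().item())
assert torch.allclose(actual, expected, atol=1e-5)
```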

@hikettei mentioned this pull request Nov 15, 2024 (28 tasks)

@hikettei (Owner, Author) commented:
Struggling with low accuracy of gemm with both the VM and the JIT.
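
One plausible way to localize that kind of accuracy problem, offered as an illustrative sketch rather than anything from this PR's test suite, is to compare a float32 matmul against a float64 reference so that ordinary accumulation error can be separated from genuine codegen bugs:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((256, 256)).astype(np.float32)
b = rng.standard_normal((256, 256)).astype(np.float32)

got = a @ b                                    # float32 accumulation
ref = a.astype(np.float64) @ b.astype(np.float64)

# Relative error on this scale is ~1e-6 for a correct f32 gemm;
# errors orders of magnitude larger point at the kernel, not the dtype.
rel_err = np.abs(got - ref).max() / np.abs(ref).max()
print(f"max relative error: {rel_err:.2e}")
```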

@hikettei changed the title from Final: MultiHeadAttention to MultiHeadAttention Nov 15, 2024
@hikettei changed the title from MultiHeadAttention to Refactor: MultiHeadAttention Nov 15, 2024
@hikettei marked this pull request as ready for review November 15, 2024 13:19
@hikettei merged commit 37b8c4a into main Nov 15, 2024 (6 checks passed)
@hikettei deleted the multi-head-attention branch November 15, 2024 13:43