[Bugfix] Remove hardcoded head_size=256 for Deepseek v2 and v3 (#12…
#252
Job | Run time |
---|---|
6s | |
6s |