
Support FP32 #72

Closed
WoosukKwon opened this issue May 5, 2023 · 0 comments · Fixed by #141

Comments

@WoosukKwon
Collaborator

Yes, it does. It is our attention kernel that does not support FP32. More precisely, our attention kernel currently does not support some block sizes when FP32 is used. I will fix this in the future.

Originally posted by @WoosukKwon in #70 (comment)
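For context, a restriction like this is usually surfaced as a guard at kernel-dispatch time rather than as a failure inside the CUDA kernel. The sketch below is not vLLM's actual code; it is a minimal, hypothetical illustration of the idea described in the comment (the table `SUPPORTED_BLOCK_SIZES`, the specific block sizes listed, and the helper `validate_attention_config` are all invented for illustration).

```python
# Hypothetical sketch only -- not vLLM's actual dispatch logic.
# Illustrates rejecting dtype / block-size combinations that the
# attention kernel cannot handle, before the kernel is launched.
import torch

# Assumed (invented) support table: FP16 covers all block sizes,
# while FP32 only covers a subset -- mirroring "does not support
# some block sizes when FP32 is used" from the comment above.
SUPPORTED_BLOCK_SIZES = {
    torch.float16: {1, 2, 4, 8, 16, 32},
    torch.float32: {1, 2, 4, 8},  # hypothetical subset
}

def validate_attention_config(dtype: torch.dtype, block_size: int) -> None:
    """Raise a clear error early instead of failing inside the kernel."""
    supported = SUPPORTED_BLOCK_SIZES.get(dtype)
    if supported is None:
        raise ValueError(f"dtype {dtype} is not supported by the attention kernel")
    if block_size not in supported:
        raise ValueError(
            f"block_size={block_size} is not supported for {dtype}; "
            f"supported sizes: {sorted(supported)}"
        )

validate_attention_config(torch.float16, 16)    # ok
validate_attention_config(torch.float32, 8)     # ok under the assumed table
# validate_attention_config(torch.float32, 32)  # would raise ValueError
```

The point of such a check is simply to turn an unsupported FP32 configuration into an explicit Python-level error with the valid options listed, rather than an opaque kernel failure.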

@WoosukKwon added and then removed the new model label on May 11, 2023
@WoosukKwon mentioned this issue on Jun 6, 2023
yukavio pushed a commit to yukavio/vllm that referenced this issue Jul 3, 2024
SUMMARY:
* Renamed forked LICENSE to LICENSE-APACHE
* Updated LICENSE to include the new Neural Magic Community License (** will
need help making the repository metadata at the top of the GitHub window on
the main branch not say Apache, since the original was forked and the
jfinks-license branch is inheriting this invisible header)
* Created NOTICE to be consistent with Neural Magic repo content

TEST PLAN:
Content was reviewed offline with Brian and Rob Shaw; however, we still need
to resolve the final LICENSE file display so it does not indicate the Apache
license type.
dllehr-amd pushed a commit to dllehr-amd/vllm that referenced this issue Jul 22, 2024
JHLEE17 pushed a commit to JHLEE17/vllm that referenced this issue Aug 1, 2024
pi314ever pushed a commit to pi314ever/vllm that referenced this issue Jan 17, 2025
remove expert_max hard code (vllm-project#47)
vLLM-Ext: Full enabling of ALiBi (vllm-project#34)
Add version inference via setuptools-scm (vllm-project#58)
Revert "vLLM-Ext: Full enabling of ALiBi (vllm-project#34)" (vllm-project#59)
Remove punica_hpu.py from vllm_hpu_extension (vllm-project#66)
Removed previous (not-pipelined) pa implementation (vllm-project#72)
Add flag to enable running softmax in fp32 (vllm-project#71)
Update calibration readme link (vllm-project#73)
allow lm_head quantization in calibration process (vllm-project#65)
Pad to bmin if value is less (vllm-project#67)
Update pyproject.toml (HabanaAI#75)

---------

Co-authored-by: Michał Kuligowski <[email protected]>