
[MFM-2025-02-03] Merge Main to llama fp8; With Faster ROCm Paged Attention #399

Merged
752 commits merged into ROCm:llama_fp8_12062024 on Feb 3, 2025

Conversation


@tjtanaa commented on Feb 3, 2025

This merge brings important features to this branch, chiefly a sync with main and the faster ROCm paged attention kernels.

ywang96 and others added 30 commits January 12, 2025 06:36
…roject#11100)

Signed-off-by: Akshat Tripathi <[email protected]>
Signed-off-by: Oleg Mosalov <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Co-authored-by: Oleg Mosalov <[email protected]>
Co-authored-by: Jee Jee Li <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
* Committing the *multilingual* P3L test.

* Created a *multi-lingual* P3L test.

* Making ruff happy.

* .

* Added a reference to the language-scripture Confluence table.

* Typo fixing.

* Harmonizing naming.

* Fixing comments in the header.

---------

Co-authored-by: Alexei V. Ivanov <[email protected]>
Co-authored-by: Gregory Shtrasberg <[email protected]>
tlrmchlsmth and others added 28 commits January 26, 2025 19:59
Signed-off-by: Bowen Wang <[email protected]>
Signed-off-by: youkaichao <[email protected]>
Co-authored-by: youkaichao <[email protected]>
* Support FP8 FA from Quark format

* Support FP8 FA from Quark format

* nit: update comment
* updating code blocks

* typo

* updated manifest

* Including feedback

* whitespace

* Deepseek instructions

* hyperlink fix

* hyperlink fix

* updating what is new

* cpx update

* typo

* whitespace

* whitespace
* integrate new cpa kernel, update tests and benchmark

* added comments to mfma4 kernel

* further comments for mfma16 kernel

* clang-format

* Lint

* add flag for logits rtz conversion and disable by default

* lint

* [Bugfix]: Fix paged attention unit tests of ROCm#372 (ROCm#389)

* [Bugfix]: fix paged attention tests based on the updated kernels in `csrc/attention/paged_attention_v1.cu`, `csrc/attention/paged_attention_v2.cu`, and `csrc/rocm/attention.cu`.

* improve code documentation.

* lint

---------

Co-authored-by: vllmellm <[email protected]>

---------

Co-authored-by: Gregory Shtrasberg <[email protected]>
Co-authored-by: Gregory Shtrasberg <[email protected]>
Co-authored-by: Joe Shajrawi <[email protected]>
Co-authored-by: TJian <[email protected]>
Co-authored-by: vllmellm <[email protected]>
Signed-off-by: vllmellm <[email protected]>
@hongxiayang merged commit 479b843 into ROCm:llama_fp8_12062024 on Feb 3, 2025
1 check failed
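For anyone picking up this branch, the headline changes in the merge are FP8 support for Llama-family checkpoints (including the Quark FP8 format mentioned in the commit log) and the faster ROCm paged attention kernels. The sketch below is a minimal, illustrative way to exercise that path through vLLM's offline API; the model path is hypothetical, and using `kv_cache_dtype="fp8"` is an assumption about how one would enable the FP8 KV cache, not something mandated by this PR.

```python
# Minimal sketch, assuming a ROCm build of vLLM from this branch and a local
# FP8-quantized Llama checkpoint (the path below is hypothetical).
from vllm import LLM, SamplingParams

llm = LLM(
    model="/models/llama-3.1-8b-fp8",  # hypothetical FP8 (e.g. Quark-format) checkpoint
    kv_cache_dtype="fp8",              # assumption: keep the KV cache in FP8 as well
)

outputs = llm.generate(
    ["The key advantage of paged attention is"],
    SamplingParams(temperature=0.0, max_tokens=64),
)
print(outputs[0].outputs[0].text)
```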