[Feat] Sage Attention Support for Triton kernel #929

Open
wants to merge 6 commits into
base: develop

l1cacheDell commented Dec 27, 2024

Adds support for the Sage Attention Triton kernel.

As a first stage, only the is_causal=False case is supported so far. Support for is_causal=True is coming soon.


CLAassistant commented Dec 27, 2024

CLA assistant check
All committers have signed the CLA.


paddle-bot bot commented Dec 27, 2024

Thanks for your contribution!

l1cacheDell changed the title from "fix CLAPAudioCfg assertion error" to "[Feat] Sage Attention Support for Triton kernel" on Jan 2, 2025
l1cacheDell marked this pull request as draft on January 2, 2025 11:59
l1cacheDell marked this pull request as ready for review on January 5, 2025 08:01
PD_BUILD_OP(${op_name})
    .Inputs({"x", "k_tensor", "v_tensor", "q_scale", "k_scale"})
    .Outputs({"out_tensor", "lse_tensor"})
    .Attrs({"output_dtype: std::string", "tensor_layout: std::string", "return_lse: int"})
Contributor

Is lse needed? If it is confirmed that inference does not need it, could it be removed?

Author

It may need to be enabled for multi-GPU parallel inference, so I suggest keeping it.
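
For context on why lse (the log-sum-exp of the attention scores) can matter in a multi-GPU setting, here is a minimal sketch. It assumes that "multi-GPU parallel inference" means the keys and values are sharded across devices, which the thread does not state. It is plain Python, not the Triton kernel from this PR; the function names `attention_with_lse` and `merge_partials` are hypothetical, and the usual 1/sqrt(d) scaling is omitted for brevity. Each shard computes attention over its own keys and returns its lse, and the partial outputs can then be combined exactly without revisiting the scores.

```python
import math


def attention_with_lse(q, keys, values):
    """Single-query softmax attention over one shard; returns (output, lse)."""
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) / total
           for d in range(dim)]
    lse = m + math.log(total)  # log of the softmax denominator
    return out, lse


def merge_partials(out_a, lse_a, out_b, lse_b):
    """Combine two shard results using only their outputs and lse values."""
    m = max(lse_a, lse_b)
    lse = m + math.log(math.exp(lse_a - m) + math.exp(lse_b - m))
    # Each shard's output is re-weighted by its share of the global denominator.
    w_a = math.exp(lse_a - lse)
    w_b = math.exp(lse_b - lse)
    merged = [w_a * a + w_b * b for a, b in zip(out_a, out_b)]
    return merged, lse


# Toy data: one query, four keys/values split into two "devices".
q = [0.5, -1.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 2.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

full_out, full_lse = attention_with_lse(q, keys, values)
out_a, lse_a = attention_with_lse(q, keys[:2], values[:2])
out_b, lse_b = attention_with_lse(q, keys[2:], values[2:])
merged_out, merged_lse = merge_partials(out_a, lse_a, out_b, lse_b)
```

In this sketch the merged result matches full attention up to floating-point error. Without the per-shard lse, the partial outputs could not be re-weighted correctly, which is one plausible reason to keep `lse_tensor` behind the `return_lse` attribute.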
