
[KVCache] Fix attention prefill kernel for Metal and Android #17539

Merged
Merged 1 commit into apache:main on Nov 21, 2024

Conversation

MasterJH5574
Contributor

This PR fixes two bugs in the attention prefill ragged kernel.

  • The first bug is the unrolling of loop `ki`, which causes a TIR build failure in the PointerValueTypeRewrite pass due to the vector size.
  • The second is that the tile sizes `tile_z` and `tile_y` may violate the assertion check in `get_tile_size` (see the sketch after this list).

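For context, here is a minimal sketch, not the actual TVM source, of the kind of divisibility assertion a tile-size helper like `get_tile_size` enforces. The function name `get_tile_size_sketch` and the parameters `rows`, `cols`, and `thread_limit` are illustrative assumptions; the point is that tile sizes picked for a large thread budget can trip the assertion on targets such as Metal and Android, where the per-block thread limit is smaller.

```python
import math


def get_tile_size_sketch(rows: int, cols: int, thread_limit: int) -> tuple[int, int]:
    """Hypothetical sketch: split a rows x cols tile across `thread_limit` threads.

    A check of this kind fails when the tile area does not divide evenly by the
    target's thread limit, which is the class of assertion failure this PR avoids
    by choosing tile sizes compatible with the target.
    """
    # Each thread handles tile_area elements; the split must be exact.
    assert (rows * cols) % thread_limit == 0, "tile does not divide the thread count evenly"
    tile_area = (rows * cols) // thread_limit

    # Pick a roughly square per-thread tile whose y-extent divides the area.
    tile_y = int(math.ceil(math.sqrt(tile_area)))
    while tile_area % tile_y != 0 and tile_y <= tile_area:
        tile_y += 1
    assert tile_y <= tile_area, "no valid tile split found"
    return tile_area // tile_y, tile_y


# Example: fits a 1024-thread budget (returns (4, 4)) ...
print(get_tile_size_sketch(32, 128, 256))
# ... but a smaller tile on the same budget trips the divisibility assertion.
# get_tile_size_sketch(8, 24, 256)  # AssertionError
```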
@tqchen tqchen merged commit 42b1e97 into apache:main Nov 21, 2024
18 checks passed