[Unity] Support TIR kernel for PagedKVCache
This PR adds PagedKVCache support that leverages TIR kernels. We do not yet have sufficient TIR kernels for multi-level sequences in PagedKVCache, so `Fork` in PagedKVCache is disabled when such a function is not provided.

This PR also adds a "reduced" creator of PagedKVCache, in which some auxiliary functions, such as the begin/end forward functions of prefill/decode, default to None.

CUDA tests are added to ensure correctness.

Co-authored-by: Hongyi Jin <[email protected]>
Co-authored-by: Bohan Hou <[email protected]>
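For readers unfamiliar with the pattern, the "reduced" creator idea (optional auxiliary functions defaulting to None, with `Fork` disabled when its kernel is missing) can be illustrated with a minimal Python sketch. This is not the TVM implementation: the class `ReducedKVCache` and the parameters `attention_kernel`, `begin_forward`, `end_forward`, and `fork_kernel` are hypothetical names chosen only for illustration.

```python
from typing import Callable, Optional


class ReducedKVCache:
    """Toy stand-in for a cache whose auxiliary hooks are optional.

    All names here are hypothetical; this is not the TVM PagedKVCache API.
    """

    def __init__(
        self,
        attention_kernel: Callable[[str], str],
        begin_forward: Optional[Callable[[], None]] = None,
        end_forward: Optional[Callable[[], None]] = None,
        fork_kernel: Optional[Callable[[int], int]] = None,
    ) -> None:
        self.attention_kernel = attention_kernel
        self.begin_forward = begin_forward
        self.end_forward = end_forward
        self.fork_kernel = fork_kernel

    def forward(self, query: str) -> str:
        # Auxiliary hooks run only when they were supplied.
        if self.begin_forward is not None:
            self.begin_forward()
        out = self.attention_kernel(query)
        if self.end_forward is not None:
            self.end_forward()
        return out

    def fork(self, parent_seq_id: int) -> int:
        # Forking is disabled when no suitable kernel exists.
        if self.fork_kernel is None:
            raise RuntimeError("fork is disabled: no fork kernel was provided")
        return self.fork_kernel(parent_seq_id)


# "Reduced" construction: only the required kernel is passed.
cache = ReducedKVCache(attention_kernel=lambda q: q.upper())
result = cache.forward("q")
```

With this construction, `forward` works without any begin/end hooks, while calling `cache.fork(0)` raises a `RuntimeError` because no fork kernel was supplied.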
1 parent 474c06b, commit 1603a90
Showing 3 changed files with 1,149 additions and 40 deletions.