Some improvements for KV caching #1891
69d6d6f to a65a96d
Can somebody help with the failing tests? I don't understand why the Windows tests fail while they pass on all other systems. I also don't understand why the GPU tests are failing.
Hello @mseeger, thank you for another PR.
Yeah, there is always something with Windows.
I'll check it tomorrow.
Hello @mseeger, it's quite a PR 🫠 🙂. (I'll take a look at why the GPU tests are failing later.)
7f2c2ce to 3226323
OK, I addressed the comments. I also made a small change in
Cool, we are almost there 🙂. On my side I'll try to find and fix issues with failing GPU tests, hopefully this year 😃.
- Shrink buffers returned by KVCache to just cover input_pos entries
- Refactor child classes of model.py classes to avoid copy and paste
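To illustrate the first item, here is a minimal sketch of what "shrinking the returned buffers" could look like. It is not the code from this PR: `ToyKVCache`, its `update` method, and the use of plain Python lists instead of tensors are all assumptions made for the example. The idea shown is that the cache keeps full-length preallocated buffers but hands back only the prefix up to the largest position in `input_pos`.

```python
class ToyKVCache:
    """Hypothetical stand-in for a KV cache; not the class from this PR."""

    def __init__(self, max_seq_length: int, head_size: int) -> None:
        # Preallocate full-length buffers, one row per sequence position.
        self.k = [[0.0] * head_size for _ in range(max_seq_length)]
        self.v = [[0.0] * head_size for _ in range(max_seq_length)]

    def update(self, input_pos, k_new, v_new):
        # Write the new keys/values at the requested positions.
        for pos, k_row, v_row in zip(input_pos, k_new, v_new):
            self.k[pos] = list(k_row)
            self.v[pos] = list(v_row)
        # Return only the prefix up to the largest position in input_pos,
        # instead of the full max_seq_length buffers.
        end = max(input_pos) + 1
        return self.k[:end], self.v[:end]


cache = ToyKVCache(max_seq_length=8, head_size=2)
k, v = cache.update([0, 1, 2], [[1.0, 1.0]] * 3, [[2.0, 2.0]] * 3)
print(len(k), len(v))
```

With this shape, attention downstream only has to look at the filled part of the cache, so no masking over unused trailing slots is needed in this toy setup.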
3226323 to 3702b03
Overall, the issue with GPU+Thunder is something specific to the latter. Thanks again for the PR (and for the patience 😊). Happy New Year! 🚀
model.py, in particular remove forward copies