[Text Generation] Internal KV Cache Support + Initial Testing Framework #1163
* initial commit
* initial commit
* [Text Generation][Tests] Text Generation Pipeline (#1162)
* initial implementation
* problems with multitoken prefill
* almost there...
* finally all tests pass
* just need to change to stub
* fix bad merge
…disable heavy tests, include patch for running without causal mask
…causal_mask_models' into feature/damian/fb_testing
…o feature/damian/fb_testing
…o feature/damian/fb_testing
@dbogunowicz LGTM overall - let's update the PR description to include which scenarios are tested in the current framework
…o feature/damian/fb_testing
tests/deepsparse/transformers/pipelines/test_text_generation.py
* fix kv cache
* refactor
* add validation pathway
* avx2 support
* initial commit
* initial commit
* initial implementation
* problems with multitoken prefill
* its working
* almost there...
* finally all tests pass
* just need to change to stub
* fix bad merge
* added some tests
* ready for review
* full support
---------
Co-authored-by: dbogunowicz <[email protected]>
Co-authored-by: Damian <[email protected]>
Left a comment about a failing test, but the error doesn't seem to be actually related to your code, so LGTM :)
…itial testing framework (#1172)
* initial commit
* improved logic
* additional improvements
* Update src/deepsparse/transformers/pipelines/text_generation.py
* Update src/deepsparse/utils/onnx.py Co-authored-by: Benjamin Fineran <[email protected]>
* Update src/deepsparse/utils/onnx.py Co-authored-by: Benjamin Fineran <[email protected]>
* response to Ben's comments
* finish rebasing
* update user messages + add assertion for safety
* minor improvements before landing
* Fix the helper function that has been broken after a merge
* [Text Generation] Internal KV Cache Support + Initial Testing Framework (#1163)
* Create test_nl_decoder_engine.py
* [Text Generation][Tests] DecoderKVCache (#1154)
* [Text Generation][Tests] NLDecoderEngine (#1155)
* initial commit
* initial commit
* [Text Generation][Tests] Text Generation Pipeline (#1162)
* initial implementation
* problems with multitoken prefill
* almost there...
* finally all tests pass
* just need to change to stub
* fix bad merge
* Make tests work with stub (as much as possible), cleanup test names, disable heavy tests, include patch for running without causal mask
* use patch from unittest library - remove additional dependency
* Update tests/deepsparse/transformers/pipelines/test_text_generation.py
* clarify todo comment
* [Text Generation] KV Cache internal Deepsparse support (#1135)
* fix kv cache
* refactor
* add validation pathway
* avx2 support
* initial commit
* initial commit
* initial implementation
* problems with multitoken prefill
* its working
* almost there...
* finally all tests pass
* just need to change to stub
* fix bad merge
* added some tests
* ready for review
* full support
---------
Co-authored-by: dbogunowicz <[email protected]>
Co-authored-by: Damian <[email protected]>
* incomplete string in parametrize
* few nits before the merge
---------
Co-authored-by: Benjamin Fineran <[email protected]>
Co-authored-by: Sage Moore <[email protected]>
---------
Co-authored-by: Benjamin Fineran <[email protected]>
Co-authored-by: Sage Moore <[email protected]>
Aggregates:
Testing scope: