Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Text Generation] Internal KV Cache Support + Initial Testing Framework #1163

Conversation

dbogunowicz
Copy link
Contributor

@dbogunowicz dbogunowicz commented Aug 3, 2023

Aggregates:

Testing scope:

  1. We are testing OPT and CodeGen
  2. Every test is repeated for a long prompt (which should end up being processed by a multi-token engine) and a short prompt (which should end up being processed by a single token engine)
  3. We test whether the pipeline correctly recognizes whether tokenizer adds a BOS token or not
  4. We test the match between the predicted output (strings) from the deepsparse pipeline vs the predicted output from hugging space pipeline
  5. We test whether the kv cache state agrees after processing the same prompt for:
  • deepsparse pipeline
  • model with kv cache ran in onnx runtime

dbogunowicz and others added 6 commits August 3, 2023 13:42
* initial commit

* initial commit

* [Text Generation][Tests] Text Generation Pipeline (#1162)

* initial implementation

* problems with multitoken prefill

* almost there...

* finally all tests pass

* just need to change to stub

* fix bad merge
…disable heavy tests, include patch for running without causal mask
@dbogunowicz dbogunowicz changed the base branch from main to feature/damian/backward_comp_no_causal_mask_models August 8, 2023 09:16
Copy link
Contributor

@bfineran bfineran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dbogunowicz LGTM overall - let's update the PR description to include which scenarios are tested in the current framework

* fix kv cache

* refactor

* add validation pathway

* avx2 support

* initial commit

* initial commit

* initial implementation

* problems with multitoken prefill

* its working

* almost there...

* finally all tests pass

* just need to change to stub

* fix bad merge

* added some tests

* ready for review

* full support

---------

Co-authored-by: dbogunowicz <[email protected]>
Co-authored-by: Damian <[email protected]>
Copy link

@Satrat Satrat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a comment about a failing test, but the error doesn't seem to be actually related to your code, so LGTM :)

@bfineran bfineran changed the title [Text Generation][Tests] Feature Branch [Text Generation] Internal KV Cache Support + Initial Testing Framework Aug 10, 2023
@bfineran bfineran merged commit 43e70a5 into feature/damian/backward_comp_no_causal_mask_models Aug 10, 2023
@bfineran bfineran deleted the feature/damian/fb_testing branch August 10, 2023 16:53
bfineran added a commit that referenced this pull request Aug 10, 2023
…itial testing framework (#1172)

* initial commit

* improved logic

* additional improvements

* Update src/deepsparse/transformers/pipelines/text_generation.py

* Update src/deepsparse/utils/onnx.py

Co-authored-by: Benjamin Fineran <[email protected]>

* Update src/deepsparse/utils/onnx.py

Co-authored-by: Benjamin Fineran <[email protected]>

* response to Ben's comments

* finish rebasing

* update user messages + add assertion for safety

* minor improvements before landing

* Fix the helper function that has been broken after a merge

* [Text Generation] Internal KV Cache Support + Initial Testing Framework (#1163)

* Create test_nl_decoder_engine.py

* [Text Generation][Tests] DecoderKVCache (#1154)

* [Text Generation][Tests] NLDecoderEngine (#1155)

* initial commit

* initial commit

* [Text Generation][Tests] Text Generation Pipeline (#1162)

* initial implementation

* problems with multitoken prefill

* almost there...

* finally all tests pass

* just need to change to stub

* fix bad merge

* Make tests work with stub (as much as possible), cleanup test names,  disable heavy tests, include patch for running without causal mask

* use patch from unittest library - remove additional dependency

* Update tests/deepsparse/transformers/pipelines/test_text_generation.py

* clarify todo comment

* [Text Generation]  KV Cache internal Deepsparse support (#1135)

* fix kv cache

* refactor

* add validation pathway

* avx2 support

* initial commit

* initial commit

* initial implementation

* problems with multitoken prefill

* its working

* almost there...

* finally all tests pass

* just need to change to stub

* fix bad merge

* added some tests

* ready for review

* full support

---------

Co-authored-by: dbogunowicz <[email protected]>
Co-authored-by: Damian <[email protected]>

* incomplete string in parametrize

* few nits before the merge

---------

Co-authored-by: Benjamin Fineran <[email protected]>
Co-authored-by: Sage Moore <[email protected]>

---------

Co-authored-by: Benjamin Fineran <[email protected]>
Co-authored-by: Sage Moore <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants