[Text-Generation] Set kv cache inputs to empty arrays (size 0) when running internally (#1195)

* fix kv cache
* refactor
* add validation pathway
* avx2 support
* initial commit
* initial commit
* initial implementation
* problems with multitoken prefill
* its working
* Create test_nl_decoder_engine.py
* almost there...
* finally all tests pass
* just need to change to stub
* fix bad merge
* added some tests
* ready for review
* [Text Generation][Tests] DecoderKVCache (#1154)
* [Text Generation][Tests] NLDecoderEngine (#1155)
* initial commit
* initial commit
* [Text Generation][Tests] Text Generation Pipeline (#1162)
* initial implementation
* problems with multitoken prefill
* almost there...
* finally all tests pass
* just need to change to stub
* fix bad merge
* Make tests work with stub (as much as possible), cleanup test names, disable heavy tests, include patch for running without causal mask
* initial commit
* use patch from unittest library - remove additional dependency
* improved logic
* additional improvements
* Update src/deepsparse/transformers/pipelines/text_generation.py
* Update src/deepsparse/utils/onnx.py
  Co-authored-by: Benjamin Fineran <[email protected]>
* Update src/deepsparse/utils/onnx.py
  Co-authored-by: Benjamin Fineran <[email protected]>
* response to Ben's comments
* finish rebasing
* full support
* Update tests/deepsparse/transformers/pipelines/test_text_generation.py
* initial commit
* clarify todo comment
* update user messages + add assertion for safety
* [Text Generation] KV Cache internal Deepsparse support (#1135)
* fix kv cache
* refactor
* add validation pathway
* avx2 support
* initial commit
* initial commit
* initial implementation
* problems with multitoken prefill
* its working
* almost there...
* finally all tests pass
* just need to change to stub
* fix bad merge
* added some tests
* ready for review
* full support

---------

Co-authored-by: dbogunowicz <[email protected]>
Co-authored-by: Damian <[email protected]>

* minor improvements before landing
* Fix the helper function that has been broken after a merge
* incomplete string in parametrize
* few nits before the merge
* pass dummy cache if internal cache management supported
* Apply suggestions from code review
* add missing property
* cleaner func
* PR ready
* initial commit
* code review comments
* set kv cache inputs to empty arrays (size 0) when running internally
* TEMP: removing inputs filtering by name
* remove obsolete argument
* trying to find a solution
* this is working
* improve documentation
* review comments
* inline comment instead of warning

---------

Co-authored-by: Sage Moore <[email protected]>
Co-authored-by: dbogunowicz <[email protected]>
Co-authored-by: Damian <[email protected]>
Co-authored-by: Luka Govedic <[email protected]>