[TextGeneration] Add Streaming Functionality #1246
Conversation
Something maybe parallel to what @mgoin mentions: one thing that I find a bit problematic is this:

created=datetime.datetime(2023, 9, 18, 15, 32, 9, 279348) prompts=['hello?', 'cool'] generations=[GeneratedText(text="'ll", score=None, finished=False, finished_reason=None)] session_id=None
created=datetime.datetime(2023, 9, 18, 15, 32, 9, 423291) prompts=['hello?', 'cool'] generations=[GeneratedText(text=' be', score=None, finished=False, finished_reason=None)] session_id=None

For the "'ll" and " be" tokens, we do not know whether they came from the "hello?" or the "cool" prompt. For now, the user does not know the relationship between a generated token and the original prompt (which egg was hatched by which chicken, in plain English :)). Clearly, it's not straightforward to resolve without a unique identifier like session_ids to convey this information...
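To make the ambiguity concrete, here is a minimal sketch of the kind of loop that could produce the output above, assuming two prompts are passed in a single call with `streaming=True` (the model stub, the `sequences` input keyword, and the generator-style return are assumptions of this sketch, not confirmed by the PR):

```python
from deepsparse import Pipeline

# Hypothetical model stub; substitute a real text-generation deployment.
pipeline = Pipeline.create(task="text_generation", model_path="zoo:...")

# Two prompts in a single call, streamed token by token (assumed generator return).
for partial in pipeline(sequences=["hello?", "cool"], streaming=True):
    generated = partial.generations[0]
    # `partial.prompts` echoes both prompts, but the GeneratedText itself
    # carries no index or identifier (and session_id is None), so there is
    # no way to tell which prompt produced this particular token.
    print(partial.prompts, generated.text, generated.finished)
```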
LGTM - let's check in some tests for this as well
Co-authored-by: Rahul Tuli <[email protected]>
Summary:
- Adds `streaming` as a boolean flag as an input, which then toggles the class `streaming` flag
- Generations are returned as `GeneratedText` objects, which also include whether or not the stream is complete via the `finished` flag
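For reference, a rough sketch of the per-chunk payload implied by the printed outputs in the comment above (field names are taken from that output; the actual schema class in the PR may differ in detail):

```python
from dataclasses import dataclass
from typing import Optional

# Rough sketch only: mirrors the fields visible in the printed streaming
# output (text, score, finished, finished_reason); the real class in the PR
# is defined by the pipeline's output schema and may differ.
@dataclass
class GeneratedText:
    text: str                               # text generated in this chunk
    score: Optional[float] = None           # optional generation score
    finished: bool = False                  # True once the stream is complete
    finished_reason: Optional[str] = None   # why the stream ended, if finished
```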
Testing:
Locally testing the following pipeline runs, with and without streaming:
Without streaming:
Output:
With streaming:
Output:
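The original snippets and outputs for these runs are not reproduced above. As a rough sketch of what such a run might look like, assuming `Pipeline.create` with the `text_generation` task (the model stub, the `sequences` input keyword, and the generator-style return for streaming are assumptions of this sketch):

```python
from deepsparse import Pipeline

# Hypothetical model stub; substitute a real text-generation deployment.
pipeline = Pipeline.create(task="text_generation", model_path="zoo:...")

# Without streaming: a single output object carrying the full generation.
output = pipeline(sequences="hello?")
print(output.generations[0].text)

# With streaming (assumed generator-style return): one partial output per
# generated token, each wrapping a GeneratedText whose `finished` flag
# flips to True on the final chunk.
for partial in pipeline(sequences="hello?", streaming=True):
    generated = partial.generations[0]
    print(generated.text, end="", flush=True)
    if generated.finished:
        break
```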