-
Notifications
You must be signed in to change notification settings - Fork 231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OpenAI API JSON formatted #995
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/995
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 0d3a5c3 with merge base a3bf37d (): This comment was automatically generated by Dr. CI and updates every 15 minutes. |
Can you add a video for non-chunked as well for record purposes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits, but the content lgtm
Remember to verify the behavior lines up with how people plan on using OpenAI
self.system_fingerprint = ( | ||
self.builder_args.device + type(self.builder_args.precision).__name__ | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a comment that this field doesn't match the spec, but is populated
We'll fix in a separate PR
server.py
Outdated
nextok = chunk.choices[0].delta.content | ||
nextok = nextok if nextok is not None else "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nextok = chunk.choices[0].delta.content | |
nextok = nextok if nextok is not None else "" | |
if (next_tok := chunk.choices[0].delta.content) is None: | |
next_tok = "" |
8472f6d
to
975a817
Compare
975a817
to
6401f55
Compare
6401f55
to
4e26b22
Compare
Implement JSON formatted responses using OpenAI API types for server completion requests. Rather than giving single tokens at a time, the server will respond with a JSON following the API dataclasses corresponding to OpenAI API types.
Testing:
Server
Request (chunked)
Request (synchronous)
Screen.Recording.2024-08-02.at.1.22.21.PM.mov