Anthropic streaming support #652
base: dmontagu/refactor-streaming
tests/models/test_anthropic.py
class MockAsyncStream(AsyncStream[T]):
    """Mock implementation of AsyncStream for testing."""

    def __init__(self, events: list[list[T]]):
To fulfill the contract of AsyncAnthropic, we need to pass create_mock a single instance of our MockAsyncStream. However, since this async stream is reused across a potentially multi-turn conversation with tool calls, we also need the ability to chain multiple event sequences together.
We modify the existing MockAsyncStream class here to accept list[list[Messages]] instead of a single event stream. If desired, this change could be refactored into the original class.
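For readers following along, here is a minimal sketch of the chaining idea, not the code in this PR. `ChainedMockStream`, `collect`, and `demo` are hypothetical names, and the class deliberately does not subclass the real `AsyncStream`; it only assumes that each turn of the conversation iterates the same stream object once.

```python
import asyncio
from typing import AsyncIterator, Generic, TypeVar

T = TypeVar("T")


class ChainedMockStream(Generic[T]):
    """Hypothetical stand-in: replays one scripted event sequence per iteration pass."""

    def __init__(self, events: list[list[T]]):
        # One inner list per expected model turn, consumed in order
        self._sequences = iter(events)

    def __aiter__(self) -> AsyncIterator[T]:
        # Each new `async for` over the same instance gets the next sequence;
        # once the script is exhausted, further passes yield nothing
        return self._replay(next(self._sequences, []))

    async def _replay(self, sequence: list[T]) -> AsyncIterator[T]:
        for event in sequence:
            yield event


async def collect(stream: ChainedMockStream[T]) -> list[T]:
    return [event async for event in stream]


async def demo() -> tuple[list[str], list[str]]:
    # Turn 1 ends in a tool call, turn 2 streams the final text
    stream = ChainedMockStream([["tool_call"], ["text_a", "text_b"]])
    first = await collect(stream)
    second = await collect(stream)
    return first, second
```

Running `asyncio.run(demo())` iterates the single instance twice and gets a different scripted sequence each time, which is the behavior the mock needs when the agent loops back after a tool call.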
Thanks! Excited to review this. We're working on getting streaming across the line here soon.
@sydney-runkle Rebased off of @dmontagu's latest changes and have tests passing here. Should be ready for a first review.
Branching off @dmontagu's pending streaming refactor in #468. I'll rebase this onto main once that work is merged.
This PR adds support for streaming responses from Anthropic's API. It is a fairly straightforward implementation in line with the OpenAI and Grok handlers, with one exception: the JSON arguments to function calls have to be assembled locally and progressively, since Claude models can yield function inputs across multiple delta messages. The JSON is not valid until the last delta message has arrived.
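To illustrate the progressive assembly described above, here is a rough sketch under stated assumptions rather than the handler in this PR. `ToolCallBuffer` and its methods are hypothetical names, and the fragments are invented sample data; the only idea taken from the description is that fragments are concatenated and parsed once the stream for that tool call is complete.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ToolCallBuffer:
    """Hypothetical accumulator for the arguments of one streamed tool call."""

    tool_name: str
    fragments: list[str] = field(default_factory=list)

    def add_delta(self, partial_json: str) -> None:
        # Fragments are raw string slices; do not try to parse them individually
        self.fragments.append(partial_json)

    def finalize(self) -> dict:
        # Only the full concatenation is expected to be valid JSON
        raw = "".join(self.fragments)
        return json.loads(raw) if raw else {}


# Invented sample fragments, split at arbitrary points in the JSON text
buffer = ToolCallBuffer("get_weather")
for fragment in ['{"city": "Par', 'is", "unit"', ': "celsius"}']:
    buffer.add_delta(fragment)
args = buffer.finalize()
```

Parsing any single fragment here, for example `json.loads('{"city": "Par')`, raises `json.JSONDecodeError`, which is why the buffer defers parsing until the final delta.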