Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Cohere models #203

Merged
merged 5 commits into from
Jan 21, 2025
Merged

Conversation

rafidka
Copy link
Contributor

@rafidka rafidka commented Dec 10, 2024

Adding support for Cohere models. Currently, only non-streaming chat is supported.

@rafidka
Copy link
Contributor Author

rafidka commented Dec 10, 2024

Hi Pydantic-AI Maintainers,

Cohere employee here. I really am excited about this new library from Pydantic, and I love to see Cohere models added as well. I published this WIP PR to help making this happen. It currently only supports non-streaming chat, but if I get initial blessing on your willingness to add Cohere and the direction I am going on in this direction, then I am happy to update the PR and add streaming support. Looking forward to hearing from you!

Thanks,
Rafid

@rafidka rafidka force-pushed the feat/support-cohere branch from 2e49958 to 28143f3 Compare December 10, 2024 16:08
@samuelcolvin
Copy link
Member

Hi @rafidka, thanks so much for the contribution and sorry for the slow reply. As per #239 we'd definitely be happy to add support for Cohere!

Note: we've made a few changes to internal APIs since you started this PR, let us know if you have any questions or need any help, you can always join our slack if you want more immediate help.

@rafidka
Copy link
Contributor Author

rafidka commented Dec 14, 2024

@samuelcolvin , thank you, great to hear you are willing to integrate Cohere with your library <3

BTW, we announced another model yesterday, which is "provides state-of-the-art performance in its class of open-weights models".

@rafidka rafidka force-pushed the feat/support-cohere branch from 28143f3 to d763c28 Compare December 14, 2024 22:18
@rafidka
Copy link
Contributor Author

rafidka commented Dec 14, 2024

Note: we've made a few changes to internal APIs since you started this PR, let us know if you have any questions or need any help, you can always join our slack if you want more immediate help.

I updated my PR to work with the new changes.

Question: would you want to have streaming support as well in the same PR, or you are ok breaking this down into two PRs?

@corafid
Copy link

corafid commented Dec 20, 2024

Any update on this?

@samuelcolvin
Copy link
Member

Question: would you want to have streaming support as well in the same PR, or you are ok breaking this down into two PRs?

I'm happy for streaming to come in a follow up pr.

@rafidka
Copy link
Contributor Author

rafidka commented Dec 21, 2024

I'm happy for streaming to come in a follow up pr.

Great! I will rebase the PR again then and hopefully can get it merged so I do the streaming next.

@rafidka rafidka force-pushed the feat/support-cohere branch from d763c28 to 8defdbb Compare December 25, 2024 20:20
@rafidka rafidka changed the title [WIP] Early support for Cohere models Early support for Cohere models Dec 25, 2024
@rafidka rafidka force-pushed the feat/support-cohere branch from 8defdbb to dfb7326 Compare December 25, 2024 20:22
@rafidka
Copy link
Contributor Author

rafidka commented Dec 25, 2024

@samuelcolvin , I rebased the PR and made the necessary changes based on your recent changes. I would like to get this merged soon if possible, to avoid having to rebase again since this is a quickly evolving code base (I had to refactor the code 2-3 times so far.) Thank you!

@rafidka rafidka changed the title Early support for Cohere models Add support for Cohere models Dec 27, 2024
Copy link
Contributor

@dmontagu dmontagu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll need tests of this model in a tests/models/test_cohere.py, similar to files used to test the other models in that folder.

We should also update the docs to mention Cohere where appropriate; at the very least, docs/index.md and README.md where it says Supports OpenAI, Anthropic, Gemini, Ollama, Groq, and Mistral, and we should also add a docs/api/models/cohere.md file similar to the docs files for the other models in that folder (and ensure it gets included in the docs build, and referenced where appropriate).

Comment on lines 71 to 75

"""
Using this more broad type for the model name instead of the ChatModel definition
allows this model to be used more easily with other model types (ie, Ollama)
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""
Using this more broad type for the model name instead of the ChatModel definition
allows this model to be used more easily with other model types (ie, Ollama)
"""

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will address.

"you can use the `cohere` optional group — `pip install 'pydantic-ai-slim[cohere]'`"
) from _import_error

type CohereModelName = Union[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We support all the way back to Python 3.9; this syntax only works in 3.12+. You should instead use

from typing_extensions import TypeAlias

...

CohereModelName: TypeAlias = Union[
    ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will address.

if response.message.content is not None and len(response.message.content) > 0:
choice = response.message.content[0]
parts.append(TextPart(choice.text))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that if cohere returns a longer-than-1 list of AssistantMessageResponseContentItem that the messages items after the first are choices? Just want to explicitly confirm that it doesn't return multiple AssistantMessageResponseContentItems that are all intended as part of a single response "choice". (Anthropic does that, but tool calls are also content items in their API.)

(If it did, we'd just want to loop over the items rather than just taking the first one. But if the list functions like choices then this makes sense.)

This wasn't clear to me based on reading https://docs.cohere.com/v2/reference/chat — it doesn't seem to explain why this field is a list. It just seems slightly odd to me that the content part of the response would be a list of choices when the tool_calls part of the response is treated as a list of items that all apply to the response, rather than choices.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wasn't clear to me based on reading https://docs.cohere.com/v2/reference/chat — it doesn't seem to explain why this field is a list. It just seems slightly odd to me that the content part of the response would be a list of choices when the

You are right, it does seem a bit strange initially. I asked internally about this and the reason behind returning a list rather than a single item is future proofing.

So, for now, we should assume we only need the first item. I will add this in comment for reference.

if texts:
# Note: model responses from this model should only have one text item, so the following
# shouldn't merge multiple texts into one unless you switch models between runs:
message_param.content = '\n\n'.join(texts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll note that if content being a list does NOT imply choices (as mentioned above), this could be converted to:

Suggested change
message_param.content = '\n\n'.join(texts)
message_param.content = [cohere.types.assistant_message_content_item.TextAssistantMessageContentItem(text=text) for text in texts]

But if it does, then the current implementation makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my response above, I changed this to:

message_param.content = [TextAssistantMessageContentItem(text='\n\n'.join(texts))]

Let me know if you think this doesn't make sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it may be better to keep it as a list rather than join, but given your point that we should only expect to get a single item, it's presumably an essentially non-breaking change to change this in the future if we want to. So I'm okay either way; if your intuition is that it's better to join messages with a double-newline, then let's just keep it like this for now.

@rafidka rafidka force-pushed the feat/support-cohere branch from dfb7326 to 4b4827e Compare January 19, 2025 18:25
@dmontagu
Copy link
Contributor

I'm going to merge this now so that if nothing else you don't need to continue dealing with changes to other APIs, etc., but if there's anything to do that isn't done yet or needs "fixing", obviously feel free to open another PR making changes as appropriate.

@dmontagu dmontagu merged commit 331ea8d into pydantic:main Jan 21, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants