add support for anthropic, azure, aleph alpha, ai21, togetherai, cohere, replicate, huggingface inference endpoints, etc. #660
Conversation
@ErikBjare what do you think of this? It feels like it's too much to have both:
We should simplify and pick a subset (maybe just one?) of all of these!
@AntonOsika happy to make any changes as necessary 🙂
Thought about it and I think we should go for it! @krrishdholakia Do we still need the special Azure parameters after this? Would be great to have that taken care of by env variables + litellm instead of code bloat! 🚀
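For context, a minimal sketch of what that could look like, assuming litellm's Azure support reads its standard environment variables (AZURE_API_KEY, AZURE_API_BASE, AZURE_API_VERSION); the deployment name below is hypothetical:

```python
# Assumes AZURE_API_KEY, AZURE_API_BASE, and AZURE_API_VERSION are set in the
# environment, so no Azure-specific parameters need to live in the code.
from litellm import completion

response = completion(
    model="azure/my-gpt4-deployment",  # hypothetical Azure deployment name
    messages=[{"role": "user", "content": "Hello, world"}],
)
print(response["choices"][0]["message"]["content"])
```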
Hey @krrishdholakia, I got this error and reverted. Do you have any ideas why?
What version of langchain were you using?
Hey @AntonOsika, here's ChatLiteLLM working for Replicate and Cohere. Here it is for OpenAI + Anthropic. Can you please let me know the version you're using so I can debug this?
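For anyone reproducing this, a minimal ChatLiteLLM sketch; the Cohere model name is just one example, and the import path may differ depending on the langchain version in use:

```python
from langchain.chat_models import ChatLiteLLM
from langchain.schema import HumanMessage

# Cohere as one example; swap the model string for a Replicate,
# Anthropic, or OpenAI model to test the other providers.
chat = ChatLiteLLM(model="command-nightly")
print(chat([HumanMessage(content="Write a haiku about builds")]))
```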
…re, replicate, huggingface inference endpoints, etc. (AntonOsika#660)
* fix streaming bug
* fixing streaming bug and adding litellm to pyproject.toml
* adding provider details to env template
…ai, cohere, replicate, huggingface inference endpoints, etc. (AntonOsika#660)" (AntonOsika#685) This reverts commit 9079f84.
How does someone use GPT-3.5 Turbo fine-tuned models with it?
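Not confirmed in this thread, but assuming litellm forwards OpenAI fine-tuned model IDs (the `ft:` prefix) to the OpenAI API unchanged, it would look something like this (the model ID is a placeholder):

```python
from litellm import completion

# Placeholder fine-tuned model ID in OpenAI's ft: format; assumes litellm
# passes it through to the OpenAI API unchanged.
response = completion(
    model="ft:gpt-3.5-turbo:my-org::abc123",
    messages=[{"role": "user", "content": "Hello"}],
)
```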
Hi @AntonOsika,
Following up on this PR - #574.
I fixed the streaming bugs and confirmed it works with azure.
Here is the gpt-engineer working with my deployed azure instance:
Here is the gpt-engineer working with ai21's "j2-mid" model:
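As a reference for the streaming fix, a minimal sketch of consuming a litellm stream; the deployment name is hypothetical, and the exact chunk shape can vary between litellm versions:

```python
from litellm import completion

response = completion(
    model="azure/my-gpt4-deployment",  # hypothetical Azure deployment name
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in response:
    # Chunks mimic the OpenAI streaming format; content may be None.
    content = chunk["choices"][0]["delta"].get("content")
    if content:
        print(content, end="")
```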
For those trying to use WizardCoder / Phind-CodeLlama, we (+ this PR) also provide support for Huggingface Inference Endpoints and Baseten.
https://docs.litellm.ai/docs/completion/supported
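For example, a hypothetical call against a deployed Huggingface Inference Endpoint (both the model name and endpoint URL below are placeholders for your own deployment):

```python
from litellm import completion

# Both the model name and api_base are placeholders for your own endpoint.
response = completion(
    model="huggingface/WizardLM/WizardCoder-Python-34B-V1.0",
    messages=[{"role": "user", "content": "def fibonacci(n):"}],
    api_base="https://my-endpoint.endpoints.huggingface.cloud",
)
```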
If anyone's using gpt-engineer in production and would like to split traffic between a fine-tuned model and gpt-4, they can do that too: https://docs.litellm.ai/docs/tutorials/ab_test_llms
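The linked tutorial covers litellm's own mechanism; as a rough sketch of the idea independent of that API (the model IDs and the 80/20 split are illustrative):

```python
import random
from litellm import completion

# Illustrative 80/20 split between a fine-tuned model and gpt-4;
# both model IDs are placeholders.
model = "ft:gpt-3.5-turbo:my-org::abc123" if random.random() < 0.8 else "gpt-4"
response = completion(model=model, messages=[{"role": "user", "content": "Hi"}])
```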
Please let me know if this PR looks good or you'd still prefer to stash it.
Happy to make any changes / update docs if the initial PR looks good.