allows better flexibility for litellm endpoints (#549)
* allows better flexibility for litellm
* add config file
* add doc
* add doc
* add doc
* add doc
* add lighteval imgs
* add lighteval imgs
* add doc
* Update docs/source/use-litellm-as-backend.mdx

Co-authored-by: Victor Muštar <[email protected]>

* Update docs/source/use-litellm-as-backend.mdx

---------

Co-authored-by: Victor Muštar <[email protected]>
Showing 6 changed files with 168 additions and 18 deletions.

`docs/source/use-litellm-as-backend.mdx` (new file, +81 lines):

# LiteLLM as backend

Lighteval lets you use LiteLLM, a backend that can call all major LLM APIs
using the OpenAI format (Bedrock, Hugging Face, VertexAI, TogetherAI, Azure,
OpenAI, Groq, etc.).

Documentation for the available APIs and compatible endpoints can be found [here](https://docs.litellm.ai/docs/).
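
LiteLLM addresses non-OpenAI providers with a `provider/model` prefix in the
model name. As a hedged illustration (the exact model identifiers below are
assumptions, so check the LiteLLM docs for current names, and make sure the
matching API key is set in your environment):

```bash
# OpenAI models can be passed bare, as in the quick-use example below:
lighteval endpoint litellm "gpt-3.5-turbo" "lighteval|gsm8k|0|0"

# Other providers are selected via a provider prefix (model ids illustrative):
lighteval endpoint litellm "anthropic/claude-3-haiku-20240307" "lighteval|gsm8k|0|0"
lighteval endpoint litellm "groq/llama-3.1-8b-instant" "lighteval|gsm8k|0|0"
```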

## Quick use

```bash
lighteval endpoint litellm \
    "gpt-3.5-turbo" \
    "lighteval|gsm8k|0|0"
```
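
A couple of notes on this command (hedged, based on LiteLLM's and Lighteval's
usual conventions): LiteLLM reads provider credentials from environment
variables, and the task spec follows Lighteval's
`suite|task|num_few_shot|truncate_few_shots` format.

```bash
# LiteLLM picks up credentials from provider-specific environment
# variables; for OpenAI that is OPENAI_API_KEY (placeholder value):
export OPENAI_API_KEY="sk-..."

# The task spec reads suite|task|num_few_shot|truncate_few_shots:
# "lighteval|gsm8k|0|0" runs gsm8k from the lighteval suite, zero-shot,
# without automatically reducing the few-shot count for long prompts.
```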

## Using a config file

LiteLLM supports generation with any OpenAI-compatible endpoint; for example,
you can evaluate a model running on a local vLLM server (a server sketch
follows the config example below).

To do so, you will need a config file like the following:

```yaml
model:
  base_params:
    model_name: "openai/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
    base_url: "URL OF THE ENDPOINT YOU WANT TO USE"
    api_key: "" # remove or keep empty as needed
  generation:
    temperature: 0.5
    max_new_tokens: 256
    stop_tokens: [""]
    top_p: 0.9
    seed: 0
    repetition_penalty: 1.0
    frequency_penalty: 0.0
```
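
For the local vLLM case mentioned above, here is a minimal sketch of how such
an endpoint could be started (an assumption-laden example: it presumes vLLM is
installed, and the model name and port are illustrative):

```bash
# Serve an OpenAI-compatible endpoint with vLLM (model and port illustrative):
python -m vllm.entrypoints.openai.api_server \
    --model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B \
    --port 8000

# The config above would then use:
#   base_url: "http://localhost:8000/v1"
#   model_name: "openai/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
```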

## Use Hugging Face Inference Providers

With this, you can also access Hugging Face Inference servers. Let's look at how to evaluate DeepSeek-R1-Distill-Qwen-32B.
First, let's see how to access the model; we can find this on [the model card](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B).

Step 1:

Step 2:

Great! Now we can simply copy-paste the `base_url` and our API key to evaluate our model.

> [!WARNING]
> Do not forget to prepend the provider in the `model_name`. Here we use an
> OpenAI-compatible endpoint, so the provider is `openai`.

```yaml
model:
  base_params:
    model_name: "openai/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
    base_url: "https://router.huggingface.co/hf-inference/v1"
    api_key: "YOUR KEY" # remove or keep empty as needed
  generation:
    temperature: 0.5
    max_new_tokens: 256 # This will override the default from the task's config
    top_p: 0.9
    seed: 0
    repetition_penalty: 1.0
    frequency_penalty: 0.0
```

We can then evaluate our model on any task available in Lighteval:

```bash
lighteval endpoint litellm \
    "examples/model_configs/litellm_model.yaml" \
    "lighteval|gsm8k|0|0"
```
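
Several tasks can be evaluated in one run by separating task specs with commas
(a sketch; the second task id below is an assumption, so check it against your
installed task list):

```bash
# Run two tasks in a single evaluation (second task id illustrative):
lighteval endpoint litellm \
    "examples/model_configs/litellm_model.yaml" \
    "lighteval|gsm8k|0|0,leaderboard|truthfulqa:mc|0|0"
```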

`examples/model_configs/litellm_model.yaml` (new file, +12 lines):

```yaml
model:
  base_params:
    model_name: "openai/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"
    base_url: "https://router.huggingface.co/hf-inference/v1"
  generation:
    temperature: 0.5
    max_new_tokens: 256
    stop_tokens: [""]
    top_p: 0.9
    seed: 0
    repetition_penalty: 1.0
    frequency_penalty: 0.0
```