[BFCL] Add gemini-1.5-pro-002, gemini-1.5-pro-002-FC, gemini-1.5-pro-001, gemini-1.5-pro-001-FC, gemini-1.5-flash-002, gemini-1.5-flash-002-FC, gemini-1.0-pro-002, gemini-1.0-pro-002-FC (ShishirPatil#658)

This PR adds the following new models to the leaderboard:

  - `gemini-1.5-pro-002`
  - `gemini-1.5-pro-002-FC`
  - `gemini-1.5-pro-001`
  - `gemini-1.5-pro-001-FC`
  - `gemini-1.5-flash-002`
  - `gemini-1.5-flash-002-FC`
  - `gemini-1.0-pro-002`
  - `gemini-1.0-pro-002-FC`

Note: All the code for prompting-mode inference with the Gemini series
was already written in ShishirPatil#644, but it was not thoroughly verified
at that time due to limited bandwidth. This PR tests that code, so we now
'officially' add support for the prompting versions of the Gemini models.
HuanzhiMao authored and VishnuSuresh27 committed Nov 11, 2024
1 parent e110fbc commit a94a4ce
Showing 7 changed files with 292 additions and 115 deletions.
11 changes: 11 additions & 0 deletions berkeley-function-call-leaderboard/CHANGELOG.md
@@ -12,6 +12,17 @@ All notable changes to the Berkeley Function Calling Leaderboard will be documen
  - `microsoft/Phi-3-mini-128k-instruct`
  - `microsoft/Phi-3-mini-4k-instruct`
  - [Sept 25, 2024] [#660](https://github.com/ShishirPatil/gorilla/pull/660): Bug fix in `parse_nested_value` function to handle nested dictionary values properly.
+ - [Sept 24, 2024] [#648](https://github.com/ShishirPatil/gorilla/pull/648): Add the following new models to the leaderboard:
+ - `gemini-1.5-pro-002`
+ - `gemini-1.5-pro-002-FC`
+ - `gemini-1.5-pro-001`
+ - `gemini-1.5-pro-001-FC`
+ - `gemini-1.5-flash-002`
+ - `gemini-1.5-flash-002-FC`
+ - `gemini-1.5-flash-001`
+ - `gemini-1.5-flash-001-FC`
+ - `gemini-1.0-pro-002`
+ - `gemini-1.0-pro-002-FC`
  - [Sept 19, 2024] [#644](https://github.com/ShishirPatil/gorilla/pull/644): BFCL V3 release:
  - Introduce new multi-turn dataset and state-based evaluation metric
  - Separate ast_checker and executable_checker for readability
9 changes: 6 additions & 3 deletions berkeley-function-call-leaderboard/README.md
@@ -132,9 +132,12 @@ Below is _a table of models we support_ to run our leaderboard evaluation agains
  |databrick-dbrx-instruct | Prompt|
  |deepseek-ai/deepseek-coder-6.7b-instruct 💻| Prompt|
  |firefunction-{v1,v2}-FC | Function Calling|
- |gemini-1.0-pro-FC | Function Calling|
- |gemini-1.5-pro-preview-{0409,0514}-FC | Function Calling|
- |gemini-1.5-flash-preview-0514-FC | Function Calling|
+ |gemini-1.0-pro-{001,002}-FC | Function Calling|
+ |gemini-1.0-pro-{001,002} | Prompt|
+ |gemini-1.5-pro-{001,002}-FC | Function Calling|
+ |gemini-1.5-pro-{001,002} | Prompt|
+ |gemini-1.5-flash-{001,002}-FC | Function Calling|
+ |gemini-1.5-flash-{001,002} | Prompt|
  |glaiveai/glaive-function-calling-v1 💻| Function Calling|
  |gpt-3.5-turbo-0125-FC| Function Calling|
  |gpt-3.5-turbo-0125| Prompt|
@@ -263,27 +263,63 @@
  "Fireworks",
  "Apache 2.0",
  ],
- "gemini-1.5-pro-preview-0514-FC": [
- "Gemini-1.5-Pro-Preview-0514 (FC)",
+ "gemini-1.5-pro-002": [
+ "Gemini-1.5-Pro-002 (Prompt)",
  "https://deepmind.google/technologies/gemini/pro/",
  "Google",
  "Proprietary",
  ],
- "gemini-1.5-flash-preview-0514-FC": [
- "Gemini-1.5-Flash-Preview-0514 (FC)",
+ "gemini-1.5-pro-002-FC": [
+ "Gemini-1.5-Pro-002 (FC)",
+ "https://deepmind.google/technologies/gemini/pro/",
+ "Google",
+ "Proprietary",
+ ],
+ "gemini-1.5-pro-001": [
+ "Gemini-1.5-Pro-001 (Prompt)",
+ "https://deepmind.google/technologies/gemini/pro/",
+ "Google",
+ "Proprietary",
+ ],
+ "gemini-1.5-pro-001-FC": [
+ "Gemini-1.5-Pro-001 (FC)",
+ "https://deepmind.google/technologies/gemini/pro/",
+ "Google",
+ "Proprietary",
+ ],
+ "gemini-1.5-flash-002": [
+ "Gemini-1.5-Flash-002 (Prompt)",
+ "https://deepmind.google/technologies/gemini/flash/",
+ "Google",
+ "Proprietary",
+ ],
+ "gemini-1.5-flash-002-FC": [
+ "Gemini-1.5-Flash-002 (FC)",
  "https://deepmind.google/technologies/gemini/flash/",
  "Google",
  "Proprietary",
  ],
- "gemini-1.5-pro-preview-0409-FC": [
- "Gemini-1.5-Pro-Preview-0409 (FC)",
- "https://deepmind.google/technologies/gemini/#introduction",
+ "gemini-1.5-flash-001": [
+ "Gemini-1.5-Flash-001 (Prompt)",
+ "https://deepmind.google/technologies/gemini/flash/",
  "Google",
  "Proprietary",
  ],
- "gemini-1.0-pro-FC": [
- "Gemini-1.0-Pro-001 (FC)",
- "https://deepmind.google/technologies/gemini/#introduction",
+ "gemini-1.5-flash-001-FC": [
+ "Gemini-1.5-Flash-001 (FC)",
+ "https://deepmind.google/technologies/gemini/flash/",
+ "Google",
+ "Proprietary",
+ ],
+ "gemini-1.0-pro-002": [
+ "Gemini-1.0-Pro-002 (Prompt)",
+ "https://deepmind.google/technologies/gemini/pro/",
+ "Google",
+ "Proprietary",
+ ],
+ "gemini-1.0-pro-002-FC": [
+ "Gemini-1.0-Pro-002 (FC)",
+ "https://deepmind.google/technologies/gemini/pro/",
  "Google",
  "Proprietary",
  ],
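Each metadata entry in this hunk appears to be a four-element list: display name, URL, organization, and license. A minimal sketch of reading such entries, assuming that layout holds for every model; `MODEL_METADATA`, `ModelInfo`, and `get_model_info` are hypothetical names for illustration and are not taken from the repository:

```python
from typing import NamedTuple


class ModelInfo(NamedTuple):
    """Hypothetical view over one metadata entry (assumed 4-element layout)."""
    display_name: str
    url: str
    organization: str
    license: str


# Two entries copied from the hunk; the dictionary name is an assumption.
MODEL_METADATA = {
    "gemini-1.5-pro-002": [
        "Gemini-1.5-Pro-002 (Prompt)",
        "https://deepmind.google/technologies/gemini/pro/",
        "Google",
        "Proprietary",
    ],
    "gemini-1.5-flash-002-FC": [
        "Gemini-1.5-Flash-002 (FC)",
        "https://deepmind.google/technologies/gemini/flash/",
        "Google",
        "Proprietary",
    ],
}


def get_model_info(model_name: str) -> ModelInfo:
    """Unpack the raw list into named fields; raises KeyError if unknown."""
    return ModelInfo(*MODEL_METADATA[model_name])


info = get_model_info("gemini-1.5-flash-002-FC")
print(info.display_name, "|", info.organization, "|", info.license)
```

Named fields keep call sites from depending on list positions, which helps if the entry layout ever changes.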
@@ -539,10 +575,16 @@
  "gpt-4-0613-FC": 30,
  "gpt-3.5-turbo-0125": 0.5,
  "gpt-3.5-turbo-0125-FC": 0.5,
- "gemini-1.0-pro-FC": 0.5,
- "gemini-1.5-pro-preview-0409-FC": 3.5,
- "gemini-1.5-pro-preview-0514-FC": 3.5,
- "gemini-1.5-flash-preview-0514-FC": 0.35,
+ "gemini-1.5-pro-002": 1.25,
+ "gemini-1.5-pro-002-FC": 1.25,
+ "gemini-1.5-pro-001": 1.25,
+ "gemini-1.5-pro-001-FC": 1.25,
+ "gemini-1.5-flash-002": 0.075 ,
+ "gemini-1.5-flash-002-FC": 0.075 ,
+ "gemini-1.5-flash-001": 0.075 ,
+ "gemini-1.5-flash-001-FC": 0.075 ,
+ "gemini-1.0-pro-002": 0.5,
+ "gemini-1.0-pro-002-FC": 0.5,
  "databricks-dbrx-instruct": 2.25,
  "command-r-plus-FC": 3,
  "command-r-plus": 3,
@@ -591,10 +633,16 @@
  "gpt-4-0613-FC": 60,
  "gpt-3.5-turbo-0125": 1.5,
  "gpt-3.5-turbo-0125-FC": 1.5,
- "gemini-1.0-pro-FC": 1.5,
- "gemini-1.5-pro-preview-0409-FC": 10.50,
- "gemini-1.5-pro-preview-0514-FC": 10.50,
- "gemini-1.5-flash-preview-0514-FC": 0.53,
+ "gemini-1.5-pro-002": 5,
+ "gemini-1.5-pro-002-FC": 5,
+ "gemini-1.5-pro-001": 5,
+ "gemini-1.5-pro-001-FC": 5,
+ "gemini-1.5-flash-002": 0.30,
+ "gemini-1.5-flash-002-FC": 0.30,
+ "gemini-1.5-flash-001": 0.30,
+ "gemini-1.5-flash-001-FC": 0.30,
+ "gemini-1.0-pro-002": 1.5,
+ "gemini-1.0-pro-002-FC": 1.5,
  "databricks-dbrx-instruct": 6.75,
  "command-r-plus-FC": 15,
  "command-r-plus": 15,
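The two price hunks above can be read together as a cost estimate. This is a sketch only: it assumes the first map holds input prices and the second output prices, both in USD per one million tokens. The dictionary names are outside the hunks shown, so `INPUT_PRICE`, `OUTPUT_PRICE`, and `estimate_cost` are hypothetical names.

```python
# Assumption: values are USD per 1,000,000 tokens; the first hunk is taken to
# be input pricing and the second output pricing. Names are hypothetical.
INPUT_PRICE = {
    "gemini-1.5-pro-002": 1.25,
    "gemini-1.5-flash-002": 0.075,
    "gemini-1.0-pro-002": 0.5,
}
OUTPUT_PRICE = {
    "gemini-1.5-pro-002": 5,
    "gemini-1.5-flash-002": 0.30,
    "gemini-1.0-pro-002": 1.5,
}

TOKENS_PER_PRICE_UNIT = 1_000_000


def estimate_cost(model_name: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one run of `model_name`."""
    total = (
        input_tokens * INPUT_PRICE[model_name]
        + output_tokens * OUTPUT_PRICE[model_name]
    )
    return total / TOKENS_PER_PRICE_UNIT


# Example: 2M input tokens and 500k output tokens on each model.
for name in INPUT_PRICE:
    print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.4f}")
```

Under these assumptions, the prompt and `-FC` variants of a model cost the same per token, since the hunks assign them identical prices.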
@@ -128,10 +128,11 @@
  "mistral-large-2407-FC",
  "mistral-small-2402-FC",
  "mistral-small-2402-FC",
- "gemini-1.0-pro-FC",
- "gemini-1.5-pro-preview-0409-FC",
- "gemini-1.5-pro-preview-0514-FC",
- "gemini-1.5-flash-preview-0514-FC",
+ "gemini-1.5-pro-002-FC",
+ "gemini-1.5-pro-001-FC",
+ "gemini-1.5-flash-002-FC",
+ "gemini-1.5-flash-001-FC",
+ "gemini-1.0-pro-002-FC",
  "meetkai/functionary-small-v3.1-FC",
  "meetkai/functionary-small-v3.2-FC",
  "meetkai/functionary-medium-v3.1-FC",
@@ -56,9 +56,16 @@
  "firefunction-v1-FC": FireworksHandler,
  "firefunction-v2-FC": FireworksHandler,
  "Nexusflow-Raven-v2": NexusHandler,
- "gemini-1.0-pro-FC": GeminiHandler,
- "gemini-1.5-pro-preview-0514-FC": GeminiHandler,
- "gemini-1.5-flash-preview-0514-FC": GeminiHandler,
+ "gemini-1.5-pro-002": GeminiHandler,
+ "gemini-1.5-pro-002-FC": GeminiHandler,
+ "gemini-1.5-pro-001": GeminiHandler,
+ "gemini-1.5-pro-001-FC": GeminiHandler,
+ "gemini-1.5-flash-002": GeminiHandler,
+ "gemini-1.5-flash-002-FC": GeminiHandler,
+ "gemini-1.5-flash-001": GeminiHandler,
+ "gemini-1.5-flash-001-FC": GeminiHandler,
+ "gemini-1.0-pro-002": GeminiHandler,
+ "gemini-1.0-pro-002-FC": GeminiHandler,
  "meetkai/functionary-small-v3.2-FC": FunctionaryHandler,
  "meetkai/functionary-medium-v3.1-FC": FunctionaryHandler,
  "databricks-dbrx-instruct": DatabricksHandler,
@@ -105,7 +112,8 @@
  # "gpt-4-0613": OpenAIHandler,
  # "claude-2.1": ClaudeHandler,
  # "claude-instant-1.2": ClaudeHandler,
- # "gemini-1.5-pro-preview-0409-FC": GeminiHandler,
+ # "gemini-1.0-pro-001": GeminiHandler,
+ # "gemini-1.0-pro-001-FC": GeminiHandler,
  # "meetkai/functionary-small-v3.1-FC": FunctionaryHandler,
  # "mistral-tiny-2312": MistralHandler,
  # "glaiveai/glaive-function-calling-v1": GlaiveHandler,
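The handler hunks map both the prompt and `-FC` variants of each Gemini model to the same `GeminiHandler` class. A minimal sketch of how such a name-to-class map might be used for dispatch follows. The `GeminiHandler` stub, the `HANDLER_MAP` name, and the `build_handler` helper are stand-ins for illustration, and the constructor signature is an assumption rather than the repository's real one.

```python
class GeminiHandler:
    """Hypothetical stub; the real class lives in the repository."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name
        # Assumption for this sketch only: the "-FC" suffix selects
        # function-calling mode, as the README table suggests for Gemini.
        self.is_fc_model = model_name.endswith("-FC")


# A few entries copied from the hunk; the dictionary name is an assumption.
HANDLER_MAP = {
    "gemini-1.5-pro-002": GeminiHandler,
    "gemini-1.5-pro-002-FC": GeminiHandler,
    "gemini-1.5-flash-002": GeminiHandler,
    "gemini-1.5-flash-002-FC": GeminiHandler,
}


def build_handler(model_name: str):
    """Look up the handler class for `model_name` and instantiate it."""
    try:
        handler_cls = HANDLER_MAP[model_name]
    except KeyError:
        raise ValueError(f"Unsupported model: {model_name}") from None
    return handler_cls(model_name)


handler = build_handler("gemini-1.5-pro-002-FC")
print(type(handler).__name__, handler.model_name, handler.is_fc_model)
```

Keeping one class for both variants means the mode has to be derived from the model name or passed in separately; the suffix check above is only one possible way to do that.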