Leaderboard Update, in sync with BFCL April 9th Release #341

HuanzhiMao · 2024-04-11T09:39:02Z

This PR updates the leaderboard data, as mentioned in #338. As a result, some scores have changed.
Note that the model glaiveai/glaive-function-calling-v1 is excluded from the leaderboard because loading it with transformers raises the error `AttributeError: 'ReplitLMTokenizer' object has no attribute 'sp_model'`. This is a bug on the transformers side, specific to this tokenizer.
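For reference, here is a minimal sketch of how a model whose tokenizer fails to load could be skipped instead of aborting a run. It is not the code used in this PR: `load_tokenizer` and `partition_loadable` are hypothetical helper names, and `trust_remote_code=True` is an assumption based on `ReplitLMTokenizer` appearing to be custom tokenizer code shipped with the model repo.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def load_tokenizer(model_name: str) -> Any:
    """Hypothetical helper: load a tokenizer via Hugging Face transformers."""
    # Imported lazily so the rest of this sketch works without transformers installed.
    from transformers import AutoTokenizer

    # Assumption: the tokenizer is custom code in the model repo, so remote code is allowed.
    return AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)


def partition_loadable(
    model_names: Iterable[str],
    loader: Callable[[str], Any] = load_tokenizer,
) -> Tuple[Dict[str, Any], List[Tuple[str, str]]]:
    """Split models into those that load and those excluded by an AttributeError."""
    loaded: Dict[str, Any] = {}
    excluded: List[Tuple[str, str]] = []
    for name in model_names:
        try:
            loaded[name] = loader(name)
        except AttributeError as exc:
            # e.g. 'ReplitLMTokenizer' object has no attribute 'sp_model'
            excluded.append((name, str(exc)))
    return loaded, excluded
```

Calling `partition_loadable(["glaiveai/glaive-function-calling-v1"])` would attempt the real download; passing a custom `loader` allows the exclusion logic to be exercised offline.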

CharlieJCJ

LGTM

This PR is for the BFCL April 9th release:

1. Bug fix in the evaluation dataset. This involves modifying both prompts and function docs.
2. Bug fix for possible answers.

The detailed breakdown is attached below. If you spot any issue with our evaluation dataset and/or possible answers, please feel free to raise an issue!

| Test Category | Prompt/Func Doc Correction Count | Possible Answer Correction Count |
|---------------------|-----------------------------|-----------------------------|
| Simple | 3 | 16 |
| Parallel | 1 | 16 |
| Multiple | 1 | 11 |
| Parallel Multiple | 10 | 43 |

This PR **DOES** change the leaderboard score. We will update the leaderboard website shortly, in PR #341

---------

Co-authored-by: Charlie Cheng-Jie Ji <[email protected]>
Co-authored-by: Fanjia Yan <[email protected]>


HuanzhiMao added 2 commits April 11, 2024 02:32

update data.csv. April 9


0ecd161

update treemap

6713be1

HuanzhiMao marked this pull request as ready for review April 11, 2024 09:39

HuanzhiMao mentioned this pull request Apr 11, 2024

BFCL April 9th Release (Dataset Bug Fix) #338

Merged

HuanzhiMao added 3 commits April 11, 2024 16:04

update data.csv

59e7fbf

Merge branch 'gh-pages' into gh-pages

93ddad7

update treemap

92dabb2

CharlieJCJ approved these changes Apr 11, 2024


ShishirPatil merged commit 06c7f9d into ShishirPatil:gh-pages Apr 11, 2024
