update hammer handler and add Hammer2.1 model #832

Merged 4 commits on Dec 14, 2024

linqq9 (Contributor) commented Dec 13, 2024

Hello, we have updated the Hammer handler and added the Hammer2.1 series of models: Hammer2.1-7b, Hammer2.1-3b, Hammer2.1-1.5b, and Hammer2.1-0.5b. Their performance on BFCL-V3 is as follows:

| Model | Overall Acc | Non-Live AST Acc | Non-Live Exec Acc | Live Acc | Multi Turn Acc | Relevance Detection | Irrelevance Detection |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MadeAgents/Hammer2.1-7b (FC) | 63.4 | 88.5 | 85.98 | 75.12 | 28 | 77.78 | 78.74 |
| MadeAgents/Hammer2.1-3b (FC) | 59.29 | 87.15 | 84.14 | 73.79 | 18.13 | 77.78 | 81.87 |
| MadeAgents/Hammer2.1-1.5b (FC) | 55.54 | 82.88 | 84.12 | 70.64 | 12.12 | 77.78 | 79.22 |
| MadeAgents/Hammer2.1-0.5b (FC) | 45.02 | 69.12 | 68.04 | 62.46 | 2.88 | 77.78 | 74.47 |

HuanzhiMao added the "BFCL-New Model" (Add New Model to BFCL) label on Dec 14, 2024

HuanzhiMao (Collaborator) left a comment

Hi @linqq9,
Thanks for the PR and excited to see its strong performance!
I made a few minor fixes in this PR. The rest LGTM.

linqq9 (Contributor, Author) commented Dec 14, 2024

> Hi @linqq9, Thanks for the PR and excited to see its strong performance! I made a few minor fixes in this PR. The rest LGTM.

Hi, thanks for your review and the fixes! I've already merged them. Appreciate your efforts. Thanks!

Bump Hammer 2.0 to 2.1 in Various Places
HuanzhiMao merged commit 21828ee into ShishirPatil:main on Dec 14, 2024
HuanzhiMao added a commit that referenced this pull request on Dec 31, 2024
This PR updates the leaderboard to reflect the score changes resulting from the following PR merges:

1. #822 
2. #826 
3. #829 
4. #832 
5. #837 
6. #840 
7. #835 
8. #842 
9. #843
10. #846 
11. #838 
12. #847 
13. #855 
14. #857 

Models were evaluated using checkpoint commit 0cea216.