Add NousResearch Hermes models to no cost models list #564

Closed

Conversation

alonsosilvaallende
Contributor

Add NousResearch Hermes models to no cost models list

Collaborator

@HuanzhiMao HuanzhiMao left a comment

Hi @alonsosilvaallende,
The Hermes model should not be added to the NO_COST_MODELS list.
We use the following formula to estimate the cost and latency for locally-hosted models: [image: cost and latency formula]
Since the Hermes model can be loaded into 8 x V100 machines, it will have the Cost and Latency fields computed as above.

@alonsosilvaallende
Contributor Author

Thank you very much for your explanation @HuanzhiMao
What does 'no cost' mean? I see that Salesforce/xLAM-7b-fc-r has no cost. Does that mean the cost could not be determined?

@HuanzhiMao
Collaborator

Yes, because the xLAM model cannot be loaded into 8 x V100 machines (bfloat16 is not compatible with V100). See here.
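
To illustrate the hardware constraint mentioned above: native bfloat16 support on NVIDIA GPUs starts at compute capability 8.0 (Ampere), while the V100 is compute capability 7.0. The following is only a minimal sketch. The helper name and the small lookup table are assumptions made for illustration and are not taken from the benchmark's code.

```python
# Hypothetical helper, not part of the leaderboard codebase.
# Compute capabilities for a few NVIDIA data-center GPUs.
COMPUTE_CAPABILITY = {
    "V100": (7, 0),
    "A100": (8, 0),
    "H100": (9, 0),
}


def supports_native_bf16(gpu_name: str) -> bool:
    """Return True if the GPU has native bfloat16 support (compute capability >= 8.0)."""
    return COMPUTE_CAPABILITY[gpu_name] >= (8, 0)


print(supports_native_bf16("V100"))  # False
print(supports_native_bf16("A100"))  # True
```

A model whose weights or kernels require bfloat16 would therefore fail this check on the 8 x V100 setup, which is why no cost can be estimated for it there.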

@alonsosilvaallende
Contributor Author

I understand. The name misled me. Last question: is it also the case that the functionary small models, such as meetkai/functionary-small-v2.4, cannot be run on 8 x V100 machines?

@HuanzhiMao
Collaborator

Yes, you are right. functionary-small does require bfloat16. We will fix it.
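
As discussed in this thread, being in `NO_COST_MODELS` means the cost could not be determined, not that the model is free. One way to picture that is sketched below. The list contents are based only on the two models named in this conversation, and `display_cost` is a hypothetical function, not the project's actual reporting code.

```python
from typing import Optional

# Assumed contents: only the two models named in this thread.
NO_COST_MODELS = [
    "Salesforce/xLAM-7b-fc-r",
    "meetkai/functionary-small-v2.4",
]


def display_cost(model_name: str, estimated_cost: Optional[float]) -> str:
    """Format the cost column; models without a usable estimate show 'N/A'."""
    if model_name in NO_COST_MODELS or estimated_cost is None:
        return "N/A"
    return f"{estimated_cost:.2f}"


# "example/hermes-model" is a placeholder name, not a real model id.
print(display_cost("Salesforce/xLAM-7b-fc-r", None))
print(display_cost("example/hermes-model", 1.234))
```

Showing "N/A" rather than 0 keeps an undetermined cost from being read as a cost of zero.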

2 participants