add mlx support #1089
Conversation
Documentation for this PR has been built. You can view it at: https://distilabel.argilla.io/pr-1089/

CodSpeed Performance Report: Merging #1089 will not alter performance.
update mlx
Let's also add the import of MlxLLM to src/distilabel/llms.py to avoid confusion until we deprecate it in 1.7.0.
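A minimal sketch of what that backwards-compatible re-export could look like (the deprecation warning and its exact wording are assumptions, not the PR's actual diff):

```python
# src/distilabel/llms.py -- backwards-compatible alias module, kept until 1.7.0.
# Sketch only: the real module may re-export more names or phrase the warning differently.
import warnings

from distilabel.models.llms import MlxLLM  # noqa: F401

warnings.warn(
    "Importing from `distilabel.llms` is deprecated and will be removed in 1.7.0; "
    "import from `distilabel.models.llms` instead.",
    DeprecationWarning,
    stacklevel=2,
)
```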
- Introduced MlxLLM class in mlx.py, integrating it into the llms module.
- Updated output preparation logic to include token computation and logprobs in utils.py.
- Modified __init__.py to export MlxLLM.
- Enhanced type annotations for tokenizer_config and model_config in MlxLLM.
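For context, a sketch of what the updated output-preparation helper in utils.py might look like; the function name `prepare_output`, its signature, and the `statistics` layout are assumptions inferred from the commit message, not the PR's actual code:

```python
from typing import Any, Dict, List, Optional


def prepare_output(
    generations: List[str],
    input_tokens: Optional[List[int]] = None,
    output_tokens: Optional[List[int]] = None,
    logprobs: Optional[List[Any]] = None,
) -> Dict[str, Any]:
    """Bundle generations with token counts and, optionally, logprobs."""
    output: Dict[str, Any] = {
        "generations": generations,
        "statistics": {
            "input_tokens": input_tokens or [],
            "output_tokens": output_tokens or [],
        },
    }
    if logprobs:
        output["logprobs"] = logprobs
    return output
```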
for more information, see https://pre-commit.ci
- Imported LlamaCppEmbeddings in models/__init__.py.
- Added LlamaCppEmbeddings to the __all__ exports in both models/__init__.py and embeddings/__init__.py.
- Removed duplicate entry of LlamaCppEmbeddings from embeddings/__init__.py exports.
Closes #995
Use it individually
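A minimal standalone usage sketch (the `path_or_hf_repo` argument and the quantized model id follow mlx-lm conventions and are assumptions, not taken from this PR):

```python
from distilabel.models.llms import MlxLLM

# Load a quantized model from the mlx-community Hub organization.
llm = MlxLLM(path_or_hf_repo="mlx-community/Meta-Llama-3.1-8B-Instruct-4bit")
llm.load()

# Generate a completion for a single chat-formatted input.
output = llm.generate_outputs(
    inputs=[[{"role": "user", "content": "What is MLX?"}]],
)
print(output)
```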
Use it with magpie
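And a sketch of pairing it with the Magpie task; the `use_magpie_template` and `magpie_pre_query_template` parameters mirror distilabel's other Magpie-capable LLMs, so treat the exact values as assumptions:

```python
from distilabel.models.llms import MlxLLM
from distilabel.steps.tasks import Magpie

llm = MlxLLM(
    path_or_hf_repo="mlx-community/Meta-Llama-3.1-8B-Instruct-4bit",
    use_magpie_template=True,
    magpie_pre_query_template="llama3",
)

magpie = Magpie(llm=llm, n_turns=1)
magpie.load()

# Magpie synthesizes the user query itself, so the input only needs a system prompt.
result = next(
    magpie.process(inputs=[{"system_prompt": "You are a helpful assistant."}])
)
print(result)
```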
It is relatively easy to spin up an mlx server, but no public Python API clients are available except for LangChain (https://python.langchain.com/docs/integrations/chat/mlx/). Currently, the OpenAI API client does not align with the payloads of either the chat or the text generation endpoint.