Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/954 llama cpp #1000

Merged
merged 34 commits into from
Jan 9, 2025

Conversation

bikash119
Copy link
Contributor

This PR adds llama-cpp support to create embeddings.

  from distilabel.embeddings import LlamaCppEmbeddings

  embeddings = LlamaCppEmbeddings(model="second-state/all-MiniLM-L6-v2-Q2_K.gguf")

  embeddings.load()

  results = embeddings.encode(inputs=["distilabel is awesome!", "and Argilla!"])
  # [
  #   [-0.05447685346007347, -0.01623094454407692, ...],
  #   [4.4889533455716446e-05, 0.044016145169734955, ...],
  # ]

@bikash119
Copy link
Contributor Author

bikash119 commented Sep 25, 2024

@davidberenstein1957 : I have added a working version for llama-cpp support to generate embeddings. Will add more parameters as we have here

…beddings should be normalized

- Added testcases to test normalize embeddings
@bikash119 bikash119 marked this pull request as ready for review September 25, 2024 14:01
@bikash119
Copy link
Contributor Author

  • Create embeddings using model from local disk
  • Create embeddings using huggingface hub model
  • Normalize embeddings by passing normalize_embeddings=True to LlamaCppEmbeddings class

@bikash119
Copy link
Contributor Author

@davidberenstein1957 : May I request you to review the changes. In addition, I couldn't add the documentation to Component Library or to API Reference. I need a little help to add / update documentation.

Copy link
Member

@davidberenstein1957 davidberenstein1957 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good already. I left some initial comments. Have you seen this already #999?

pyproject.toml Outdated Show resolved Hide resolved
src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
tests/unit/conftest.py Outdated Show resolved Hide resolved
bikash119 and others added 8 commits September 26, 2024 08:08
Accept recommended suggestion

Co-authored-by: David Berenstein <[email protected]>
- Incorporated changes suggested in review comments.
- use atexit to forcefully invoke cleanup
- Add test_encode_batch_consistency to ensure consistent results
- Test large batch processing capability
- Verify embedding dimensions and count for different batch sizes
@davidberenstein1957
Copy link
Member

davidberenstein1957 commented Sep 30, 2024 via email

@bikash119
Copy link
Contributor Author

@davidberenstein1957 : Ok , let me create a different PR to handle the scenario. I will keep the this PR untouched and once we have the scenario handled using mixins, will remove the token handling code from LlamaCppEmbeddings class.

@bikash119 bikash119 mentioned this pull request Oct 2, 2024
@bikash119
Copy link
Contributor Author

Hi @davidberenstein1957 : I have made the changes to make model download from hub reusable through mixin. Please share your feedback. If this looks good, will create another PR to handle the model download in llamacpp class of llm too.

Copy link
Member

@davidberenstein1957 davidberenstein1957 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to update the docs and API references :)

src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
src/distilabel/mixins/hub_downloader.py Outdated Show resolved Hide resolved
…ngs."

This reverts commit 778532f.
HF_TOKEN can be set as env variable to download gated model
- alligned the attribute as per the review comments.
@bikash119
Copy link
Contributor Author

bikash119 commented Oct 3, 2024

Don't forget to update the docs and API references :)

Have included the LlamaCppEmbeddings class to __init__.py under embeddings folder. On executing mkdocs serve I could see the documenation in Component Gallery and API reference.

…r --cpu-only.

`pytest tests/unit/embeddings/test_llamacpp.py --cpu-only` will generate embeddings using cpu
`pytest tests/unit/embeddings/test_llamacpp.py` will generate embeddings using gpu
@bikash119
Copy link
Contributor Author

hi @davidberenstein1957 : Thank you for your time to review my work. I am tried to incorporate your feedback and did a few improvements.

  • Added example for user to use both public and private model. For private models, the user has to configure export HF_TOKEN=hf... .
  • Added an option to test --cpu-only to test embedding generation on cpu.
  • Added examples for public , private, cpu based generation.
    Please share your feedback.

Copy link
Member

@davidberenstein1957 davidberenstein1957 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates. Left some minor comments. :)

src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
tests/unit/embeddings/test_llamacpp.py Outdated Show resolved Hide resolved
tests/unit/conftest.py Outdated Show resolved Hide resolved
tests/unit/conftest.py Outdated Show resolved Hide resolved
src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
.gitignore Outdated Show resolved Hide resolved
src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
from llama_cpp import Llama


class LlamaCppEmbeddings(Embeddings, CudaDevicePlacementMixin):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aren't some of the attributes already present in the parent class?

src/distilabel/embeddings/llamacpp.py Outdated Show resolved Hide resolved
tests/unit/conftest.py Outdated Show resolved Hide resolved
tests/unit/conftest.py Outdated Show resolved Hide resolved
tests/unit/conftest.py Outdated Show resolved Hide resolved
tests/unit/embeddings/test_llamacpp.py Outdated Show resolved Hide resolved
tests/unit/embeddings/test_llamacpp.py Outdated Show resolved Hide resolved
bikash119 and others added 4 commits October 14, 2024 17:19
try except block is not needed.

Co-authored-by: David Berenstein <[email protected]>
hidden attributes shouldn't be documented.

Co-authored-by: David Berenstein <[email protected]>
Copy link
Member

@davidberenstein1957 davidberenstein1957 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @bikash119!

@davidberenstein1957 davidberenstein1957 merged commit 99c2448 into argilla-io:develop Jan 9, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants