
Update special token handling for new llama.cpp API #1105

Merged — 4 commits into guidance-ai:main on Feb 5, 2025

Conversation

afg1
Contributor

@afg1 afg1 commented Jan 31, 2025

Guidance currently uses the now-deprecated `llama_token_get_text`; this PR updates it to use `llama_vocab_get_text` instead. I think the functionality is completely equivalent.

Deprecation change is suggested here: https://github.com/ggerganov/llama.cpp/blob/5783575c9d99c4d9370495800663aa5397ceb0be/include/llama.h#L962

Support for DeepSeek models requires an upgrade to llama-cpp-python==0.3.7, under which guidance==0.2.0 throws either a Bus Error or Segmentation Fault at

tok = llama_cpp.llama_token_get_text(model_obj.model, i)

I've not been able to run all tests, but I did run
python -m pytest --selected_model llamacpp_phi3_mini_4k_instruct_cpu ./tests -k llama_cpp
for which all 28 selected tests passed. The same tests fail with the bus error on aa9da9d.

@Harsha-Nori
Collaborator

Thanks for this @afg1 -- really appreciate the catch! Need to have a quick think on how much we want to preserve back-compat for old llama-cpp-python versions, but other than that this looks great to me :).

@hudson-ai
Collaborator

hudson-ai commented Feb 4, 2025

@Harsha-Nori I'm personally not too worried about backwards compat here (unless we see a flood of issues demanding we restore support for an old version 😉).

LGTM, but @Harsha-Nori I'll let you have the final call on whether to do something about backwards compat.

Thank you @afg1 !

@codecov-commenter

codecov-commenter commented Feb 5, 2025

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 75.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 53.91%. Comparing base (ba8c34d) to head (d14f2ed).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
guidance/models/llama_cpp/_llama_cpp.py 75.00% 1 Missing ⚠️


Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1105      +/-   ##
==========================================
+ Coverage   50.85%   53.91%   +3.06%     
==========================================
  Files          71       71              
  Lines        5880     5883       +3     
==========================================
+ Hits         2990     3172     +182     
+ Misses       2890     2711     -179     


@Harsha-Nori Harsha-Nori merged commit c7ec824 into guidance-ai:main Feb 5, 2025
26 checks passed
@Harsha-Nori
Collaborator

Harsha-Nori commented Feb 5, 2025

merged in -- thanks again @afg1!
