Update special token handling for new llama.cpp API #1105

afg1 · 2025-01-31T12:08:07Z

Guidance currently uses the now-deprecated `llama_token_get_text`; this PR updates it to use `llama_vocab_get_text` instead. I think the functionality is completely equivalent.

Deprecation change is suggested here: https://github.com/ggerganov/llama.cpp/blob/5783575c9d99c4d9370495800663aa5397ceb0be/include/llama.h#L962

Support for DeepSeek models requires an upgrade to llama-cpp-python==0.3.7, under which guidance==0.2.0 throws either a Bus Error or Segmentation Fault at

guidance/guidance/models/llama_cpp/_llama_cpp.py

Line 78 in aa9da9d

tok = llama_cpp.llama_token_get_text(model_obj.model, i)

I've not been able to run all tests, but I did run
`python -m pytest --selected_model llamacpp_phi3_mini_4k_instruct_cpu ./tests -k llama_cpp`
for which all 28 selected tests passed. The same tests fail with the bus error on aa9da9d.
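For readers following along, the shape of the change can be sketched roughly as below. This is a minimal illustration, not the actual diff: `llama_model_get_vocab` and `llama_vocab_get_text` are the names mentioned in this PR, while the helper `collect_token_texts`, the `api` parameter, and the fake stand-in objects are hypothetical and exist only so the snippet runs without a model file.

```python
from types import SimpleNamespace


def collect_token_texts(api, model, n_tokens):
    """Return the raw text of token ids 0..n_tokens-1 via the vocab-based API.

    `api` is anything exposing llama_model_get_vocab and llama_vocab_get_text
    (for example the llama_cpp module); passing it in is an assumption made
    here to keep the sketch testable without a real model.
    """
    # The vocab has to be fetched from the model first.
    vocab = api.llama_model_get_vocab(model)
    if not vocab:
        # Guard against an invalid (null) vocab handle.
        raise RuntimeError("llama_model_get_vocab returned an invalid vocab")
    return [api.llama_vocab_get_text(vocab, i) for i in range(n_tokens)]


# Hypothetical stand-in for the bindings, used only to exercise the helper.
fake_api = SimpleNamespace(
    llama_model_get_vocab=lambda model: model["vocab"],
    llama_vocab_get_text=lambda vocab, i: vocab[i],
)
fake_model = {"vocab": [b"<s>", b"</s>", b"hello"]}

print(collect_token_texts(fake_api, fake_model, 3))
```

With the real bindings, `api` would presumably be the `llama_cpp` module and `model` the loaded model handle; that pairing is assumed here rather than taken from the diff.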


Harsha-Nori · 2025-02-03T20:53:25Z

Thanks for this @afg1 -- really appreciate the catch! Need to have a quick think on how much we want to preserve back-compat for old llama-cpp-python versions, but other than that this looks great to me :).

hudson-ai · 2025-02-04T17:09:25Z

@Harsha-Nori I'm personally not too worried about backwards compat here (unless we see a flood of issues demanding we restore support for an old version 😉).

LGTM, but @Harsha-Nori I'll let you have the final call on whether to do something about backwards compat.

Thank you @afg1 !
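If backwards compat with older llama-cpp-python versions were wanted, one possible approach is to pick the accessor at runtime. The sketch below is only an illustration of that idea and is not something this PR implements: `make_token_text_getter` and the fake binding objects are hypothetical, and only the three `llama_*` function names come from this thread.

```python
from types import SimpleNamespace


def make_token_text_getter(api, model):
    """Return a function mapping a token id to its text.

    Prefers the vocab-based API when the bindings expose it and otherwise
    falls back to the deprecated model-based call.
    """
    get_vocab = getattr(api, "llama_model_get_vocab", None)
    get_text = getattr(api, "llama_vocab_get_text", None)
    if get_vocab is not None and get_text is not None:
        vocab = get_vocab(model)
        if not vocab:
            raise RuntimeError("llama_model_get_vocab returned an invalid vocab")
        return lambda i: get_text(vocab, i)
    # Older bindings: the deprecated function takes the model directly.
    return lambda i: api.llama_token_get_text(model, i)


# Hypothetical stand-ins for new and old bindings, for demonstration only.
tokens = [b"<s>", b"</s>"]
new_api = SimpleNamespace(
    llama_model_get_vocab=lambda model: model["vocab"],
    llama_vocab_get_text=lambda vocab, i: vocab[i],
)
old_api = SimpleNamespace(
    llama_token_get_text=lambda model, i: model["vocab"][i],
)
model = {"vocab": tokens}

print(make_token_text_getter(new_api, model)(1))
print(make_token_text_getter(old_api, model)(1))
```

Feature detection via `getattr` avoids parsing version strings, at the cost of keeping a code path that exercises a deprecated call.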

codecov-commenter · 2025-02-05T00:48:35Z

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 75.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 53.91%. Comparing base (ba8c34d) to head (d14f2ed).
Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| guidance/models/llama_cpp/_llama_cpp.py | 75.00% | 1 Missing ⚠️ |

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1105      +/-   ##
==========================================
+ Coverage   50.85%   53.91%   +3.06%     
==========================================
  Files          71       71              
  Lines        5880     5883       +3     
==========================================
+ Hits         2990     3172     +182     
+ Misses       2890     2711     -179

☔ View full report in Codecov by Sentry.

Harsha-Nori · 2025-02-05T07:09:15Z

merged in -- thanks again @afg1!

Don't use deprecated llama_token_get_text, but use 'llama_vocab_get…

a9f3ded

…_text` instead As suggested here: https://github.com/ggerganov/llama.cpp/blob/5783575c9d99c4d9370495800663aa5397ceb0be/include/llama.h#L962 Requires getting the vocab out of the model first though

paulbkoch force-pushed the main branch from c291eda to c4531ea Compare February 3, 2025 08:45

hudson-ai approved these changes Feb 4, 2025


Merge branch 'main' into main

fcc9668

paulbkoch added 2 commits February 4, 2025 16:59

update llama-cpp-python version to 0.3.7

52923a4

check for invalid return from llama_model_get_vocab

d14f2ed

Harsha-Nori merged commit c7ec824 into guidance-ai:main Feb 5, 2025
26 checks passed
