
changelog : libllama API #9289

Open
ggerganov opened this issue Sep 3, 2024 · 4 comments
Labels
documentation Improvements or additions to documentation

ggerganov (Owner) commented Sep 3, 2024

Overview

This is a list of changes to the public interface of the llama library. Collaborators are encouraged to edit this post so that it reflects important API changes that end up merged into the master branch.

If you are building a third-party project that relies on libllama, it is recommended to follow this issue and to check it before upgrading to new versions.

See also:

Recent API changes (most recent at the top)

| version | PR | description |
| --- | --- | --- |
| TBD | #11063 | Update `llama_model` API naming |
| b4357 | #10784 | Remove `llama_model_get_tensor()` |
| b4337 | #10803 | Change `llama_sampler_init_penalties()` |
| b4282 | #10446 | Remove support for Q4_0_N_M model files in favor of automatic repacking of Q4_0 |
| b4167 | #10497 | Add `devices` to `llama_model_params` |
| b3988 | #10071 | Remove Tail-Free sampling |
| b3948 | #9897 | Deprecate softmax sampler and update dist sampler |
| b3943 | #9745 | Remove `all_pos_0`, `all_pos_1`, `all_seq_id` from `llama_batch` |
| b3908 | #9798 | Update FIM-related API |
| b3841 | #9510 | Add `LLAMA_POOLING_TYPE_RANK` |
| b3774 | #9512 | Add `llama_n_head()` |
| b3750 | #9355 | Add `llama_perf` API + param to disable internal profiling |
| b3749 | #9445 | Add `llama_sampler_chain_remove()` |
| b3681 | #9294 | Major changes to the sampling API (see PR; a brief sketch follows the table) |
| b3651 | #8980 | Add `LLAMA_VOCAB_TYPE_RWKV` enum value |
| b3644 | #8672 | Add `llama_threadpool` API + change `uint32_t` -> `int32_t` |
| b3614 | #8526 | Add `llama_model_is_recurrent()` |
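
As a rough illustration of the sampler-chain interface introduced in #9294 and extended by #9445, here is a minimal sketch in C. Function names are taken from `llama.h` around those builds and may have shifted in later releases, so treat this as a sketch to verify against your header rather than a definitive example:

```c
#include "llama.h"

// Build a sampler chain (top-k -> temperature -> final distribution sampling)
// and draw one token. Assumes `ctx` already holds logits from llama_decode().
llama_token sample_next(struct llama_context * ctx) {
    struct llama_sampler_chain_params sparams = llama_sampler_chain_default_params();
    struct llama_sampler * chain = llama_sampler_chain_init(sparams);

    llama_sampler_chain_add(chain, llama_sampler_init_top_k(40));
    llama_sampler_chain_add(chain, llama_sampler_init_temp(0.8f));
    llama_sampler_chain_add(chain, llama_sampler_init_dist(LLAMA_DEFAULT_SEED));

    // idx = -1 samples from the logits of the last token in the batch
    const llama_token id = llama_sampler_sample(chain, ctx, -1);

    llama_sampler_free(chain);
    return id;
}
```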

For older changes, use:

git log --oneline -p b3614 -- include/llama.h

(For collaborators) To map between PR numbers and build numbers:

git log --oneline | tail -r | nl
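
Note that `tail -r` is a BSD/macOS option; on GNU systems the same reversal is available as `tac`, e.g. `git log --oneline | tac | nl`.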

Upcoming API changes

  • TBD
ggerganov added the documentation label Sep 3, 2024
ggerganov pinned this issue Sep 3, 2024
ggerganov (Owner) commented

#9355 restores the functionality for getting performance measurements from within libllama (which was removed in #9294) via a new `llama_perf` API. `llama_context_params` is extended with a new `bool no_perf` parameter that can be used to disable the internal timings during libllama compute.
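
As a rough usage sketch (function and field names as they appear in `llama.h` around b3750; later changes such as the #11063 renaming may differ, so check your header):

```c
#include <stdio.h>
#include "llama.h"

// Create a context with internal profiling enabled, then query the timings.
// Assumes `model` is an already-loaded llama_model.
void report_perf(struct llama_model * model) {
    struct llama_context_params cparams = llama_context_default_params();
    cparams.no_perf = false; // set to true to skip timing collection during compute

    struct llama_context * ctx = llama_new_context_with_model(model, cparams);

    // ... tokenize and llama_decode() as usual ...

    struct llama_perf_context_data pd = llama_perf_context(ctx);
    printf("eval: %d tokens in %.2f ms\n", pd.n_eval, pd.t_eval_ms);

    llama_perf_context_print(ctx); // or let libllama format the full report

    llama_free(ctx);
}
```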

ddh0 (Contributor) commented Jan 3, 2025

Looks like `llama_model_get_tensor` was removed from the API, but that change was not documented here.

ggerganov (Owner) commented

> Looks like `llama_model_get_tensor` was removed from the API, but that change was not documented here.

I didn't expect that this function was being used by anyone, so I skipped updating the changelog. It's updated now.

Btw, what do you use this call for?

ddh0 (Contributor) commented Jan 5, 2025

> I didn't expect that this function was being used by anyone, so I skipped updating the changelog. It's updated now.
>
> Btw, what do you use this call for?

I don't use it personally, but the function was included in my Python code; I started to get ctypes "symbol not found" errors and had to do some digging to figure out why. No worries!
