Releases: ggerganov/llama.cpp

b4426

06 Jan 13:05
96a1dc2
llama : prevent system info string accumulation across calls (#11101)
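For context, a minimal sketch of the affected call (the surrounding program is illustrative; `llama_print_system_info()` is the real API). Per the commit title, repeated calls no longer return an ever-growing, accumulated string:

```c
#include <stdio.h>
#include "llama.h"

int main(void) {
    // Each call now reports the same system info instead of
    // appending to a string that grows across calls.
    for (int i = 0; i < 3; i++) {
        printf("%s\n", llama_print_system_info());
    }
    return 0;
}
```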

b4425

06 Jan 12:51
6369f86
llama : rename missed batch params/vars to ubatch (#10059)

This commit renames the `batch` parameter to `ubatch` in the
`llama_kv_cache_find_slot`, `llm_build_inp_embd`, and
`llm_build_mamba` functions.

The motivation is that this rename should have been done as part of
commit 19d900a7565b8f6b0a708836a57d26966cb9efe2 ("llama : rename batch
to ubatch (#9950)"), but these functions were missed in that commit and
only noticed now.

b4424

06 Jan 12:49
47182dd
llama : update llama_model API names (#11063)

* llama : deprecate llama_free_model, add llama_model_free

ggml-ci

* llama : change `llama_load_model_from_file` -> `llama_model_load_from_file`

ggml-ci
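A hedged migration sketch for the renamed entry points (the model path and error handling are illustrative; the function names are those introduced above):

```c
#include <stdio.h>
#include "llama.h"

int main(void) {
    struct llama_model_params params = llama_model_default_params();

    // New name: llama_model_load_from_file (replaces the deprecated
    // llama_load_model_from_file).
    struct llama_model * model = llama_model_load_from_file("model.gguf", params);
    if (model == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    // New name: llama_model_free (replaces the deprecated llama_free_model).
    llama_model_free(model);
    return 0;
}
```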

b4423

06 Jan 12:22
3e6e7a6
tokenize : escape the prompt (#11058)

* tokenize : escape the prompt

* tokenize : update help

b4422

06 Jan 12:02
ae2f606
mmap : fix fileno macro clash (#11076)

* mmap : fix fileno macro clash

ggml-ci

* cont

ggml-ci

b4421

06 Jan 12:00
727368c
llama : use LLAMA_TOKEN_NULL (#11062)

ggml-ci
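As a hedged illustration of the sentinel this commit standardizes on (the helper and variable names below are illustrative, not from the codebase):

```c
#include <stdbool.h>
#include "llama.h"

// LLAMA_TOKEN_NULL replaces hard-coded -1 sentinels meaning "no token".
static bool token_is_set(llama_token t) {
    return t != LLAMA_TOKEN_NULL;
}

// Example: a "last sampled token" slot starts out as the null sentinel.
static llama_token last_token = LLAMA_TOKEN_NULL;
```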

b4420

06 Jan 12:00
5047dd3
llama : use _impl suffix instead of _internal (#11060)

ggml-ci

b4419

06 Jan 02:18
46e3556
CUDA: add BF16 support (#11093)

* CUDA: add BF16 support

b4418

04 Jan 20:57
b56f079
Vulkan: Add device-specific blacklist for coopmat for the AMD proprie…

b4417

04 Jan 20:50
9394bbd
llama : Add support for DeepSeek V3 (#11049)

* convert : extend DEEPSEEK2 model architecture to support DeepseekV3ForCausalLM by adding EXPERT_WEIGHTS_NORM and EXPERT_GATING_FUNC model parameters and FFN_EXP_PROBS_B tensor type

* vocab : add DeepSeek V3 pre-tokenizer regexes

* unicode : handle ACCENT_MARK and SYMBOL categories in regex

* llama : add DeepSeek V3 chat template, handle new model parameters and tensor types

---------

Co-authored-by: Stanisław Szymczyk <[email protected]>