
Trying to split model with --split-max-size, but gguf-split ignores it #6654

Closed
RichardErkhov opened this issue Apr 13, 2024 · 2 comments · Fixed by #6655
Labels
bug Something isn't working split GGUF split model sharding

Comments

RichardErkhov commented Apr 13, 2024

Latest version, Ubuntu 22.04, conda python=3.10.
Trying to split a model with gguf-split, but something is going wrong:

(base) richard@richard-ProLiant-DL580-Gen9:~/Desktop/ramdisk/banana/llama.cpp$ ./gguf-split --split --split-max-size 4000M --dry-run /media/richard/5fbd0bfa-8253-4803-85eb-80a13218a927/grok-1-fp16-gguf/grok-1-Q5_K.gguf Q5_K/grok-1 
n_split: 1
split 00001: n_tensors = 2115, total_size = 214437M
gguf_split: 1 gguf split written with a total of 2115 tensors.
(base) richard@richard-ProLiant-DL580-Gen9:~/Desktop/ramdisk/banana/llama.cpp$ ./gguf-split --split --split-max-size 4G --dry-run /media/richard/5fbd0bfa-8253-4803-85eb-80a13218a927/grok-1-fp16-gguf/grok-1-Q5_K.gguf Q5_K/grok-1 
n_split: 17
split 00001: n_tensors = 128, total_size = 14609M
split 00002: n_tensors = 128, total_size = 13184M
split 00003: n_tensors = 128, total_size = 12648M
split 00004: n_tensors = 128, total_size = 12597M
split 00005: n_tensors = 128, total_size = 12648M
split 00006: n_tensors = 128, total_size = 12750M
split 00007: n_tensors = 128, total_size = 12836M
split 00008: n_tensors = 128, total_size = 13088M
split 00009: n_tensors = 128, total_size = 13197M
split 00010: n_tensors = 128, total_size = 12597M
split 00011: n_tensors = 128, total_size = 12597M
split 00012: n_tensors = 128, total_size = 12699M
split 00013: n_tensors = 128, total_size = 12699M
split 00014: n_tensors = 128, total_size = 12597M
split 00015: n_tensors = 128, total_size = 13137M
split 00016: n_tensors = 128, total_size = 13675M
split 00017: n_tensors = 67, total_size = 6868M
gguf_split: 17 gguf split written with a total of 2115 tensors.
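For reference, the behavior the reporter expects from `--split-max-size` can be sketched as follows. This is a minimal illustration of the intended semantics, not the actual llama.cpp implementation; the function names `parse_max_size` and `plan_splits`, and the 1024-based unit multipliers, are assumptions for the sketch.

```python
# Hypothetical sketch of what --split-max-size should do:
# parse a size suffix, then pack tensors greedily into shards.

def parse_max_size(arg: str) -> int:
    """Parse sizes like '4000M' or '4G' into bytes (assumed 1024-based)."""
    units = {"M": 1024 ** 2, "G": 1024 ** 3}
    suffix = arg[-1].upper()
    if suffix not in units:
        raise ValueError(f"expected M or G suffix, got: {arg}")
    return int(arg[:-1]) * units[suffix]

def plan_splits(tensor_sizes: list[int], max_size: int) -> list[list[int]]:
    """Start a new shard whenever adding the next tensor would push the
    current shard past max_size; a lone oversized tensor still gets its
    own shard."""
    shards: list[list[int]] = [[]]
    current = 0
    for size in tensor_sizes:
        if shards[-1] and current + size > max_size:
            shards.append([])
            current = 0
        shards[-1].append(size)
        current += size
    return shards
```

Under these semantics, `--split-max-size 4000M` on a 214437M model should clearly produce more than one shard, and no shard (except one holding a single oversized tensor) should exceed the limit — which contradicts both runs in the log above.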
@phymbert phymbert added split GGUF split model sharding bug Something isn't working and removed bug-unconfirmed labels Apr 13, 2024
phymbert (Collaborator) commented:
See:

RichardErkhov (Author) commented:

See:

So basically it doesn't work and I need to use --split-max-tensors for now?
