bug in k_quants.c #3636

Closed
shlevi-microsoft opened this issue Oct 15, 2023 · 2 comments · Fixed by #3646

shlevi-microsoft commented Oct 15, 2023

There is a bug in k_quants.c.

The loop index j is incremented by k in each iteration, which means that each iteration processes a block of k elements starting at column j. However, the loop condition is j < nb. This condition looks incorrect, because nb is calculated as k / QK_K, which is a number of blocks, not a number of elements.

j < nb should be changed to j < n.
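
To make the arithmetic concrete, here is a minimal sketch in C that counts loop iterations under the two conditions. It is not the code from k_quants.c: the helper names are hypothetical, and QK_K = 256 is an assumed value chosen for illustration.

```c
#include <assert.h>

#define QK_K 256  /* assumed block size for illustration; the real value is defined by the project */

/* Counts the iterations of: for (int j = 0; j < limit; j += step) */
static int count_iterations(int limit, int step) {
    int count = 0;
    for (int j = 0; j < limit; j += step) {
        count++;
    }
    return count;
}

/* Hypothetical helper: iterations with the reported condition j < nb, where nb = k / QK_K */
static int iterations_reported(int k) {
    assert(k % QK_K == 0);
    const int nb = k / QK_K;
    return count_iterations(nb, k);
}

/* Hypothetical helper: iterations with the proposed condition j < n */
static int iterations_proposed(int n, int k) {
    assert(k % QK_K == 0);
    return count_iterations(n, k);
}
```

Under these assumptions, with n = 1024 and k = 256 the reported condition gives one iteration, while the proposed condition gives n / k = 4. With n = k both give one iteration, which would explain why the problem does not show up in that case.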

coezbek (Contributor) commented Oct 15, 2023

You are certainly right that, since nb < k (because nb = k / QK_K), the loop body is executed only once (with j == 0).

On the other hand, if this is wrong, then all the ggml_quantize_... functions would be wrong.

Maybe this code isn't used, and the code from ggml.c is used instead: https://github.com/ggerganov/llama.cpp/blob/master/ggml.c#L20565

shlevi-microsoft (Author) commented Oct 16, 2023

The code is definitely used (I found the bug while debugging it), for example when you quantize to q4_k.
I noticed that ggml_quantize_q4_K is only called with n = k, so in that case a single loop iteration is valid.
But if k < n, it will lead to a bug, because only the first k elements are processed.
To avoid this, consider fixing the code to handle cases where k < n properly.
The fix is very simple: j < nb should be changed to j < n.
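
As a rough sketch of what the proposed condition changes when k < n, the C code below uses a stand-in row function that copies values rather than quantizing them. The names fake_quantize_row and fake_quantize are hypothetical and are not part of k_quants.c; the sketch also assumes n is a multiple of k.

```c
#include <assert.h>

/* Hypothetical stand-in for a row quantizer: copies k floats unchanged. */
static void fake_quantize_row(const float *src, float *dst, int k) {
    for (int i = 0; i < k; i++) {
        dst[i] = src[i];
    }
}

/* Processes n elements as n / k rows of k elements each.
 * Returns the number of rows processed. */
static int fake_quantize(const float *src, float *dst, int n, int k) {
    assert(k > 0 && n % k == 0);       /* assumption of this sketch */
    int rows = 0;
    for (int j = 0; j < n; j += k) {   /* proposed condition: j < n */
        fake_quantize_row(src + j, dst + j, k);
        rows++;
    }
    return rows;
}
```

With the condition j < n, every row up to element n is visited, so a call with n = 1024 and k = 256 processes four rows instead of stopping after the first.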
