
gguf dequantize failed #31725

Closed
1 of 4 tasks
PenutChen opened this issue Jul 1, 2024 · 11 comments

@PenutChen
Contributor

System Info

transformers==4.42.3
torch==2.3.0

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

The example usage from the docs:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
filename = "tinyllama-1.1b-chat-v1.0.Q6_K.gguf"

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=filename)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)

Expected behavior

Running the snippet produces the following error:

Converting and de-quantizing GGUF tensors...:   0%|                         | 0/201 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/data2/Penut/LLM-Backend/hello.py", line 7, in <module>
    model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data2/Penut/.miniconda/envs/Py311/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data2/Penut/.miniconda/envs/Py311/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3583, in from_pretrained
    state_dict = load_gguf_checkpoint(gguf_path, return_tensors=True)["tensors"]
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data2/Penut/.miniconda/envs/Py311/lib/python3.11/site-packages/transformers/modeling_gguf_pytorch_utils.py", line 146, in load_gguf_checkpoint
    weights = load_dequant_gguf_tensor(shape=shape, ggml_type=tensor.tensor_type, data=tensor.data)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data2/Penut/.miniconda/envs/Py311/lib/python3.11/site-packages/transformers/integrations/ggml.py", line 499, in load_dequant_gguf_tensor
    values = dequantize_q6_k(data)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/data2/Penut/.miniconda/envs/Py311/lib/python3.11/site-packages/transformers/integrations/ggml.py", line 284, in dequantize_q6_k
    data_f16 = np.frombuffer(data, dtype=np.float16).reshape(num_blocks, block_size // 2)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: cannot reshape array of size 26880000 into shape (152,105)
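
For reference, a quick arithmetic check of the numbers in the traceback, assuming the 210-byte Q6_K block size that transformers uses (105 float16 values per block, hence the 105 in the failing reshape): the buffer itself divides evenly into Q6_K blocks, so the suspicious value is the computed block count, not the data.

Q6_K_BLOCK_BYTES = 210                 # bytes per Q6_K block (105 float16 values)
n_f16 = 26_880_000                     # float16 elements seen by np.frombuffer
n_bytes = n_f16 * 2                    # 53,760,000 bytes in the raw tensor buffer
assert n_bytes % Q6_K_BLOCK_BYTES == 0
print(n_bytes // Q6_K_BLOCK_BYTES)     # 256000 blocks, not the 152 used in the reshape
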
@amyeroberts
Collaborator

cc @SunMarc

@PenutChen
Contributor Author

PenutChen commented Jul 2, 2024

A workaround is to replace num_blocks with -1 in this code, but I'm not sure whether this is the correct behavior.

# transformers/integrations/ggml.py

def dequantize_q6_k(data):
    block_size = GGML_BLOCK_SIZES["Q6_K"]
    num_blocks = len(data) // block_size

    # changed: reshape with -1 so NumPy infers the block count from the buffer itself
    data_f16 = np.frombuffer(data, dtype=np.float16).reshape(-1, block_size // 2)
    data_u8 = np.frombuffer(data, dtype=np.uint8).reshape(-1, block_size)
    data_i8 = np.frombuffer(data, dtype=np.int8).reshape(-1, block_size)

    scales = data_f16[:, -1].reshape(-1, 1).astype(np.float32)
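
My reading of why the -1 helps (an assumption, not confirmed above): np.frombuffer always consumes the entire underlying buffer, so letting NumPy infer the row count keeps the reshape consistent with the real byte length even when len(data) does not return the byte count, for example if the reader hands back a multi-dimensional array instead of a flat bytes object. A minimal illustration with a hypothetical stand-in for tensor.data:

import numpy as np

block_size = 210  # bytes per Q6_K block

# The same 420 bytes exposed either as a flat bytes object or as a 2-D uint8 array.
flat = bytes(420)
two_d = np.zeros((2, block_size), dtype=np.uint8)

for data in (flat, two_d):
    values = np.frombuffer(data, dtype=np.float16)  # always sees all 210 float16 values
    print(len(data), len(data) // block_size)       # 420 -> 2 blocks, but 2 -> 0 blocks
    values.reshape(-1, block_size // 2)             # inferring the row count works for both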

@SunMarc
Member

SunMarc commented Jul 2, 2024

Hey @PenutChen, thanks for opening the issue! I tried your snippet on the main branch of transformers and on v4.42.3, and everything looks fine. I suggest you clear your cache and try again. Also, which version of numpy are you using? Maybe this is an issue with the 2.0 version that was released recently.

@PenutChen
Contributor Author

@SunMarc Thanks for the reply! I upgraded the numpy version to 1.26.4, but I still get the same error. After checking all my dependencies, I found that my gguf was installed from the source of the llama.cpp repo. I changed the version to the PyPI one, and it works!
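
A quick way to double-check which gguf build is actually being imported (just a suggestion, any equivalent check works):

import importlib.metadata

print(importlib.metadata.version("gguf"))  # the PyPI release pinned later in this thread is 0.6.0

pip show gguf also prints the Version and Location fields, which can help spot an install that came from a local llama.cpp checkout.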

@SunMarc
Member

SunMarc commented Jul 2, 2024

Thanks for investigating! Hopefully the next release of gguf won't have the issue you experienced.

@PenutChen
Contributor Author

PenutChen commented Jul 3, 2024

The latest release of the gguf package is from Dec 13, 2023, but the gguf source in the llama.cpp repo still updates frequently, and there are incompatibilities between the two. For anyone experiencing this issue, try the following command:

pip install gguf==0.6.0 "numpy<2.0" --force-reinstall

@PenutChen
Contributor Author

Hi @SunMarc, just a reminder that gguf-py has been updated to 0.9.1 recently. There might be some issues with this version. If I find anything new, I will reopen this issue.

@SunMarc
Member

SunMarc commented Jul 12, 2024

Hi @PenutChen, thanks for the warning! It looks like we indeed have failing tests on our side, with the same error you experienced. I will reopen the issue =)

@SunMarc SunMarc reopened this Jul 12, 2024
@gelbartm

Downgrading to gguf==0.6.0 solved it for me. Thanks to @PenutChen for the hint.


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@PenutChen
Contributor Author

solved by #32298
