Allow conversion of Llama / Mistral HF models #6144
Conversation
Why not just use:

```python
@Model.register("LlamaForCausalLM", "MistralForCausalLM", "MixtralForCausalLM")
```

on the existing MixtralModel (call it LlamaModel)? I don't see a point in supporting Mistral in this script without supporting Llama, and these classes are identical, so they can be merged.
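For concreteness, the merged class would look roughly like this (a sketch based on the decorator above and the model_arch line that appears in the diff later in this thread; Model and gguf.MODEL_ARCH are the existing helpers in convert-hf-to-gguf.py):

```python
# Sketch of the merged registration: one class handles all three
# HF architecture names and maps them to the LLAMA GGUF architecture.
@Model.register("LlamaForCausalLM", "MistralForCausalLM", "MixtralForCausalLM")
class LlamaModel(Model):
    model_arch = gguf.MODEL_ARCH.LLAMA
```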
Sure, we can do that. I tried to mimic what I saw in the existing entries.

In my opinion, it could be interesting to use different architectures for Mistral (and Mixtral) for informative purposes. If you click on the GGUF information for any Mistral file on the Hugging Face Hub, there's nothing that refers to mistral except the filename. But that would also require additional changes, so I'm happy to use the same entry for the three of them!
I tried to convert this model: https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca and got an exception. The model converts fine with convert.py.
convert-hf-to-gguf.py (Outdated)

```diff
 model_arch = gguf.MODEL_ARCH.LLAMA

 def set_vocab(self):
-    self._set_vocab_sentencepiece()
+    self._set_vocab_hf()
```
Testing conversion with fine-tuned models, I experienced this problem: #6320. If that PR is merged, then we can also use _set_vocab_sentencepiece() here.
Sorry for the delay, @cebtenzzre, I could only return to this today. Testing the model you mentioned (https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca), I ran into the inconsistency in the sentencepiece vocab method reported in #6320. In addition, I had not taken care of tensor permutation in my initial PR. I tested conversion of that model and verified that generation matches a conversion produced by convert.py.
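For reference, the tensor permutation in question is the reordering of the Q/K projection weights that convert.py applies for LLaMA-family models; a minimal sketch, assuming numpy arrays and modeled on convert.py's permute helper:

```python
import numpy as np

def permute(weights: np.ndarray, n_head: int, n_head_kv: int | None) -> np.ndarray:
    # Reorder rows of the Q/K projection from HF's interleaved rotary
    # layout to the layout llama.cpp expects.
    if n_head_kv is not None and n_head != n_head_kv:
        n_head = n_head_kv  # grouped-query attention: K uses n_head_kv heads
    return (weights.reshape(n_head, 2, weights.shape[0] // n_head // 2, *weights.shape[1:])
                   .swapaxes(1, 2)
                   .reshape(weights.shape))
```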
It looks like the CI timed out after 6 hours.
This PR conflicts with #6355, which renames HfVocab to LlamaHfVocab and makes it specific to models with a tokenizer.json; Mistral-7B-OpenOrca only has a tokenizer.model. To be consistent with convert.py's default --vocab-type, after that PR is merged you would want to do something like:

```python
try:
    self._set_vocab_sentencepiece()
except FileNotFoundError:
    self._set_vocab_llama_hf()
```

The benefit is a conditional dependency on transformers (sentencepiece is a required dependency at the moment) and accurate token scores when tokenizer.model is available. Does that seem reasonable?
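For context, the fallback works because the sentencepiece path raises FileNotFoundError when tokenizer.model is missing. A simplified, hypothetical sketch of that check (the real method in convert-hf-to-gguf.py does considerably more):

```python
from pathlib import Path
from sentencepiece import SentencePieceProcessor  # currently a hard dependency

def _set_vocab_sentencepiece(self):
    tokenizer_path = Path(self.dir_model) / "tokenizer.model"
    if not tokenizer_path.is_file():
        # No sentencepiece model: let the caller fall back to the
        # transformers-based loader (_set_vocab_llama_hf).
        raise FileNotFoundError(f"File not found: {tokenizer_path}")
    tokenizer = SentencePieceProcessor(str(tokenizer_path))
    # ... read pieces and scores, then write them to the GGUF metadata ...
```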
@cebtenzzre yes, makes total sense! I merged and applied those changes, then tested with Mistral-7B-OpenOrca and Mistral-7B-v0.1.
With the changes I added, the set of metadata keys exactly matches the one written by convert.py (checked with gguf-dump), with only one minor difference.
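One way to do that comparison programmatically (a sketch, assuming the gguf-py package from the llama.cpp repo and two hypothetical output files):

```python
from gguf import GGUFReader  # gguf-py package from the llama.cpp repo

# Hypothetical file names: one produced by convert-hf-to-gguf.py,
# one produced by convert.py from the same HF checkpoint.
keys_hf = set(GGUFReader("model-hf.gguf").fields.keys())
keys_py = set(GGUFReader("model-convert.gguf").fields.keys())
print(sorted(keys_hf ^ keys_py))  # metadata keys present in only one file
```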
Thank you! Much appreciated! 🙌
* Allow conversion of Mistral HF models
* Homogenize Llama, Mistral, Mixtral under the same entry.
* Fix tokenizer, permute tensors
* Use sentencepiece tokenizer, or fall back to hfft.
* convert-hf : small fix for mypy
* convert-hf : fix duplicated block_count
* convert-hf : add vocab size to metadata

Co-authored-by: Jared Van Bortel <[email protected]>
This allows converting fine-tuned models with `convert-hf-to-gguf.py`. The base architecture is set to `llama`, as in the models converted by @TheBloke. If necessary, we can add a new entry to `constants.py`.
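For reference, a dedicated entry would mean extending the MODEL_ARCH enum in gguf-py/gguf/constants.py, roughly like this (a hypothetical sketch; the PR deliberately keeps Mistral under LLAMA instead):

```python
from enum import IntEnum, auto

class MODEL_ARCH(IntEnum):
    LLAMA = auto()
    # Hypothetical new entry; not part of this PR, which reuses LLAMA
    # for Mistral and Mixtral.
    MISTRAL = auto()
```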