Allow conversion of Llama / Mistral HF models #6144
Conversation
Why not just use:

```python
@Model.register("LlamaForCausalLM", "MistralForCausalLM", "MixtralForCausalLM")
```

on the existing MixtralModel (call it LlamaModel)? I don't see a point in supporting Mistral in this script without supporting Llama, and these classes are identical, so they can be merged.
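For concreteness, the merged class would look roughly like this (a sketch based on the decorator above and the model_arch line that appears in the diff later in this thread; Model and gguf.MODEL_ARCH are the existing helpers in convert-hf-to-gguf.py):

```python
# Sketch of the merged registration: one class handles all three
# HF architecture names and maps them to the LLAMA GGUF architecture.
@Model.register("LlamaForCausalLM", "MistralForCausalLM", "MixtralForCausalLM")
class LlamaModel(Model):
    model_arch = gguf.MODEL_ARCH.LLAMA
```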
Sure, we can do that. I tried to mimic what I saw in the existing entries.

In my opinion, it could be interesting to use different architectures for Mistral (and Mixtral) for informative purposes. If you click on the GGUF information for any Mistral file on the Hugging Face Hub, there's nothing that refers to mistral except the filename. But that would also require additional changes, so I'm happy to use the same entry for the three of them!
I tried to convert this model: https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca and got an exception. The model converts fine with convert.py.
convert-hf-to-gguf.py (Outdated)

```diff
 model_arch = gguf.MODEL_ARCH.LLAMA

 def set_vocab(self):
-    self._set_vocab_sentencepiece()
+    self._set_vocab_hf()
```
Testing conversion with fine-tuned models, I experienced this problem: #6320. If that PR is merged, then we can also use _set_vocab_sentencepiece() here.
Sorry for the delay, @cebtenzzre, I could only return to this today. Testing the model you mentioned (https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca), I ran into the inconsistency in the sentencepiece vocab method reported in #6320. In addition, I had not taken care of tensor permutation in my initial PR. I tested conversion of that model and verified that generation matches a conversion produced by convert.py.
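For reference, the tensor permutation in question is the reordering of the Q/K projection weights that convert.py applies for LLaMA-family models; a minimal sketch, assuming numpy arrays and modeled on convert.py's permute helper:

```python
import numpy as np

def permute(weights: np.ndarray, n_head: int, n_head_kv: int | None) -> np.ndarray:
    # Reorder rows of the Q/K projection from HF's interleaved rotary
    # layout to the layout llama.cpp expects.
    if n_head_kv is not None and n_head != n_head_kv:
        n_head = n_head_kv  # grouped-query attention: K uses n_head_kv heads
    return (weights.reshape(n_head, 2, weights.shape[0] // n_head // 2, *weights.shape[1:])
                   .swapaxes(1, 2)
                   .reshape(weights.shape))
```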
It looks like the CI timed out after 6 hours.
This PR conflicts with #6355, which renames HfVocab to LlamaHfVocab and makes it specific to models with a tokenizer.json; Mistral-7B-OpenOrca only has a tokenizer.model. To be consistent with convert.py's default --vocab-type, after that PR is merged you would want to do something like:

```python
try:
    self._set_vocab_sentencepiece()
except FileNotFoundError:
    self._set_vocab_llama_hf()
```

The benefit is a conditional dependency on transformers (sentencepiece is a required dependency at the moment) and accurate token scores when tokenizer.model is available. Does that seem reasonable?
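For context, the fallback works because the sentencepiece path raises FileNotFoundError when tokenizer.model is missing. A simplified, hypothetical sketch of that check (the real method in convert-hf-to-gguf.py does considerably more):

```python
from pathlib import Path
from sentencepiece import SentencePieceProcessor  # currently a hard dependency

def _set_vocab_sentencepiece(self):
    tokenizer_path = Path(self.dir_model) / "tokenizer.model"
    if not tokenizer_path.is_file():
        # No sentencepiece model: let the caller fall back to the
        # transformers-based loader (_set_vocab_llama_hf).
        raise FileNotFoundError(f"File not found: {tokenizer_path}")
    tokenizer = SentencePieceProcessor(str(tokenizer_path))
    # ... read pieces and scores, then write them to the GGUF metadata ...
```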
@cebtenzzre yes, makes total sense! I merged and applied those changes, then tested with Mistral-7B-OpenOrca and Mistral-7B-v0.1.
With the changes I added, the set of metadata keys exactly matches the one written by convert.py (checked with gguf-dump), with only one minor difference.
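One way to do that comparison programmatically (a sketch, assuming the gguf-py package from the llama.cpp repo and two hypothetical output files):

```python
from gguf import GGUFReader  # gguf-py package from the llama.cpp repo

# Hypothetical file names: one produced by convert-hf-to-gguf.py,
# one produced by convert.py from the same HF checkpoint.
keys_hf = set(GGUFReader("model-hf.gguf").fields.keys())
keys_py = set(GGUFReader("model-convert.gguf").fields.keys())
print(sorted(keys_hf ^ keys_py))  # metadata keys present in only one file
```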
Thank you! Much appreciated! 🙌
* Allow conversion of Mistral HF models
* Homogenize Llama, Mistral, Mixtral under the same entry.
* Fix tokenizer, permute tensors
* Use sentencepiece tokenizer, or fall back to hfft.
* convert-hf : small fix for mypy
* convert-hf : fix duplicated block_count
* convert-hf : add vocab size to metadata

Co-authored-by: Jared Van Bortel <[email protected]>
This allows converting fine-tuned models with `convert-hf-to-gguf.py`. The base architecture is set to `llama`, as in the models converted by @TheBloke. If necessary, we can add a new entry to `constants.py`.
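For reference, a dedicated entry would mean extending the MODEL_ARCH enum in gguf-py/gguf/constants.py, roughly like this (a hypothetical sketch; the PR deliberately keeps Mistral under LLAMA instead):

```python
from enum import IntEnum, auto

class MODEL_ARCH(IntEnum):
    LLAMA = auto()
    # Hypothetical new entry; not part of this PR, which reuses LLAMA
    # for Mistral and Mixtral.
    MISTRAL = auto()
```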