Bug: b3028 breaks mixtral 8x22b #7969
Comments
Could you point to the exact model file you used?
I tested on 3 different files; the third was my own quants, made with the latest version on the model downloaded straight from mistralai. The b3028 tokenizer patch breaks all of them. I plan to run IQ4_XS once the bug is corrected, as it appears to work better than Q4_K_S on other models and is also 2 GB smaller for this big model.
I added b3028 revert patches to my github here: https://github.com/steampunque/llama.cpp-patches , covering the span from the original patch to the latest release as of today (b3173). I don't recommend applying the reverts unless you want to run this model, since other desirable changes in newer versions may be erased.

Without the patch, b3173:

Apply revert_b3028_from_3173.patch:
Post the …
b3181 with no revert. Use any of the model files I summarized earlier.
Aborted conversation:
b3181 with b3028 reverted:
I don't have 8x22B handy, but with 8x7B I am not able to reproduce using that command:

```
make -j && ./llama-cli -m ./models/mixtral-instruct-8x7b-fast/ggml-model-q4_k.gguf --color -n -1 --multiline-input --interactive-first -ngl 7 -c 8192 -ctk f16 -ctv f16 -b 128 -fa -n 8192 --keep 0 --temp 0.0 --dynatemp-range 0.0 --dynatemp-exp 1.0 --top-k 40 --top-p 0.95 --typical 1.0 --min-p 0.00 --repeat-last-n 64 --repeat-penalty 1.0 --presence-penalty 0.0 --frequency-penalty 0.0 --tfs 1.0 --mirostat 0 --mirostat-lr 0.1 --mirostat-ent 5.0 -p "" --in-prefix "[INST] " --in-suffix " [/INST]" --verbose-prompt
```
You can try to remove the whitespace from the instruction suffix/prefix:
I downloaded the model and indeed there is a regression for the Mixtral 8x22B models.

Before:

```
./tokenize -m Mixtral-8x22B-Instruct-v0.1.IQ4_XS-00001-of-00002.gguf -p "[INST]"
3 -> '[INST]'
```

Now:

```
1501 -> ' ['
17057 -> 'INST'
29561 -> ']'
```

cc @jaime-m-p
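The same check can be made against the HF tokenizer as a reference (a minimal sketch; the local tokenizer path is an assumption, mirroring the one used in the snippets further down this thread):

```python
from transformers import AutoTokenizer

# Illustrative local path to the model's HF tokenizer files.
tok = AutoTokenizer.from_pretrained("./models/tokenizers/mixtral8x22b")

# With special-token handling intact, "[INST]" should map to a single id (3);
# the regression instead splits it into ' [' + 'INST' + ']'.
print(tok.encode("[INST]", add_special_tokens=False))
```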
Below appears to be the list of special tokens the model wants from tokenizer.json. I think Mistral Instruct v0.3 also moved to special tokens for [INST] [/INST] (around the same time the function-calling stuff was added and the vocabulary expanded).
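For reference, that list can be pulled straight out of tokenizer.json; a sketch, assuming the tokenizer files are available locally (path is illustrative):

```python
import json

# Read the added_tokens list from the model's tokenizer.json (illustrative path)
# to see which entries are flagged "special".
with open("./models/tokenizers/mixtral8x22b/tokenizer.json") as f:
    added = json.load(f).get("added_tokens", [])

for t in added:
    print(t["id"], t["content"], "(special)" if t.get("special") else "")
```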
I'm trying to find the root of the problem. Found some differences while loading special tokens:

```python
from transformers import AutoTokenizer

dir_tokenizer = './models/tokenizers/mixtral8x22b'
tokenizer = AutoTokenizer.from_pretrained(dir_tokenizer)
tokenizer.added_tokens_encoder
# {'<unk>': 0, '<s>': 1, '</s>': 2, '[INST]': 3, '[/INST]': 4, '[TOOL_CALLS]': 5, '[AVAILABLE_TOOLS]': 6, '[/AVAILABLE_TOOLS]': 7, '[TOOL_RESULTS]': 8, '[/TOOL_RESULTS]': 9}
```

```python
import gguf

dir_tokenizer = './models/tokenizers/mixtral8x22b'
special_vocab = gguf.SpecialVocab(dir_tokenizer)
special_vocab.special_token_ids
# {'bos': 1, 'eos': 2, 'unk': 0}
```

I have to look closer, but I'm confused by the method …
Probably this: 938cb49. I think the correct option is to try to fix …
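To make the discrepancy concrete, the two listings above can be diffed directly (a sketch reusing the same illustrative path; it only reports what one side sees and the other misses):

```python
from transformers import AutoTokenizer
import gguf

dir_tokenizer = './models/tokenizers/mixtral8x22b'  # illustrative path

hf_added = AutoTokenizer.from_pretrained(dir_tokenizer).added_tokens_encoder
gguf_ids = set(gguf.SpecialVocab(dir_tokenizer).special_token_ids.values())

# Added tokens the HF tokenizer reports but SpecialVocab did not pick up,
# e.g. [INST] and [/INST] in the listings above.
for content, token_id in hf_added.items():
    if token_id not in gguf_ids:
        print(token_id, content)
```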
In my opinion, having to regenerate the GGUF should not be an issue. It has happened multiple times in the past with BPE updates and will no doubt continue into the future as models evolve with new tokenizers, new vocabularies, and other needs. It would be good to output a warning message saying that the model should be updated, if it is at all possible to detect the condition. It might also be possible to create a utility that just updates the metadata, to avoid the pain of re-converting and re-quantizing the larger models. This particular issue might wipe out a lot of models leaning on special-token adds, though.
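For the inspection half of that idea, the gguf-py package can already read the tokenizer metadata out of an existing file without touching the tensors (a sketch; the file name is illustrative, and this only lists fields rather than rewriting anything):

```python
from gguf import GGUFReader

# Open an existing quant (illustrative file name) and list the
# tokenizer metadata fields, which is where the special-token info lives.
reader = GGUFReader("Mixtral-8x22B-Instruct-v0.1.IQ4_XS-00001-of-00002.gguf")

for name in reader.fields:
    if name.startswith("tokenizer.ggml."):
        print(name)
```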
If the …
I reproduced this in …

- b3027: no problems
- b3028-b3086: strange prefix/suffix tokens
- b3087-b3140: random mentions of IELTS
It's still broken as of b3233. It is necessary to revert the b3027->b3028 update if you want to run the model.
This also broke Yi-1.5-34b.
Input: hi
b3028
In b3028, …
Update: MistralAI updated the tokenizer configs for 8x22b a couple of days ago, so I ran a new convert/quant and all looks OK now with `SPECIAL=1 tokenize " lm Hello"`. I tested with 3 versions: b3266 with the b3028 revert patch, b3266 with no revert patch, and the latest b3324 with no revert patch, and all were good on the new convert. So it looks like the Mistral guys fixed the problem on their side; however, it will require re-converting from their new update as of a couple of days ago. I will leave this issue open for now since the bug breaks other models too, but Mixtral 8x22b is now good thanks to the mistralai 8x22b update.
Will there be proper support for special tokens, such that the model would see the difference between, for example, the special token …
There is already support - see the …
Right. What I meant is support in the examples.
There will be, but I can't say when - it does not seem very high prio IMO.
This issue was closed because it has been inactive for 14 days since being marked as stale.
What happened?
Mixtral 8x22b model running with the server.

b3027: good

```
lm hi
Hello! How can I help you today? Is there something specific you would like to talk about or ask about? I'm here to provide information and answer questions to the best of my ability.
```

b3028: garbage

```
lm hi
👋
[INST] I'm here to help you with your questions about the [/INST] 🤓
[INST] I can provide information on a variety of topics, such as [/INST] 📚
[INST] - [/INST] 🏫
```
Name and Version
b3027 for good run
b3028 for broken run
What operating system are you seeing the problem on?
Linux
Relevant log output