With Mixtral 8x7b, [INST] and [/INST] are not tokenized correctly.
[Debug: Dump Forwarded Input Tokens, format: 6] ' (28705)', '\n (13)', '[ (28792)', 'INST (16289)', '] (28793)', ' hi (12014)', ' [ (733)', '/ (28748)', 'INST (16289)', '] (28793)', '\n (13)',
Model: https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1
Instruct Tag Preset: Mistral
Ggerganov also noted this problem on llama.cpp, although there it only happens for Mixtral 8x22b, not 8x7b: ggml-org#7969 (comment)
With WizardLM-2-8x22B, this also happens with USER: and ASSISTANT:.
[Debug: Dump Forwarded Input Tokens, format: 6] '<s> (1)', ' (28705)', '\n (13)', 'USER (11123)', ': (28747)', ' hi (12014)', '\n (13)', 'ASS (4816)', 'IST (8048)', 'ANT (12738)', ': (28747)', ' (28705)',
Model: https://huggingface.co/alpindale/WizardLM-2-8x22B
Instruct Tag Preset: Vicuna
Format: Instruct Mode
Koboldcpp 1.70.1
I tested this, and it is not caused by b3028 in llama.cpp: versions of koboldcpp from before b3028 already have this issue.
Actually I don't think this is a bug. I'm looking at the mixtral vocab, and [INST] is not a token.
https://huggingface.co/neuralmagic/Mixtral-8x7B-Instruct-v0.1-FP8/raw/main/tokenizer.json
https://huggingface.co/Doctor-Shotgun/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss/raw/main/tokenizer.json
As far as I can see there are no mixtral models that use that added token.
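For anyone who wants to repeat this check on a downloaded tokenizer.json, here is a rough sketch in Python. It assumes the usual Hugging Face layout (a top-level "added_tokens" list plus a "model" -> "vocab" map). The helper name find_token and the SAMPLE dict are made up for illustration: SAMPLE is a tiny hand-written stand-in, not the real Mixtral vocab, and it only reuses IDs that appear in the debug dumps above.

```python
def find_token(tokenizer: dict, text: str):
    """Return (source, token_id) if `text` is a single token, else None.

    `tokenizer` is a parsed tokenizer.json (assumed Hugging Face layout).
    """
    # Special/added tokens live in a list of {"id": ..., "content": ...} objects.
    for entry in tokenizer.get("added_tokens", []):
        if entry.get("content") == text:
            return ("added_tokens", entry["id"])
    # Regular pieces live in model.vocab as {piece: id}.
    vocab = tokenizer.get("model", {}).get("vocab", {})
    if text in vocab:
        return ("vocab", vocab[text])
    return None


# Hypothetical miniature tokenizer; IDs are copied from the dumps in this issue.
SAMPLE = {
    "added_tokens": [{"id": 1, "content": "<s>"}],
    "model": {"vocab": {"[": 28792, "INST": 16289, "]": 28793}},
}

for candidate in ("<s>", "INST", "[INST]"):
    print(repr(candidate), "->", find_token(SAMPLE, candidate))

# For a real file, load it first:
#   import json
#   with open("tokenizer.json", encoding="utf-8") as f:
#       tokenizer = json.load(f)
```

If find_token returns None for [INST] on the real file, the tag can only be encoded as several ordinary pieces, which would match the dump in the first post.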