In the README, we were asked to report unsupported models. The supported_models.yaml lists microsoft/phi-1_5 as supported.
However, when I run:
python cfg_generate.py -m "microsoft/phi-1_5" 'You would represent "My dog Sparky is 7 years old and weighs 21 kg" in JSON as '
Where cfg_generate.py is only a minor modification from the code in the README:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import argparse

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers_cfg.grammar_utils import IncrementalGrammarConstraint
from transformers_cfg.generation.logits_process import (
    GrammarConstrainedLogitsProcessor,
)

DEFAULT_GRAMMAR = "examples/grammars/json.ebnf"
DEFAULT_MODEL = "openai-community/gpt2-xl"


def main():
    parser = argparse.ArgumentParser(
        description="Generate text using a specified model and grammar."
    )
    parser.add_argument(
        "-m",
        "--model_id",
        default=DEFAULT_MODEL,
        help=f"The ID of the model to use. (default: {DEFAULT_MODEL})",
    )
    parser.add_argument(
        "-g",
        "--grammar_file",
        default=DEFAULT_GRAMMAR,
        help=f"The path to the grammar file. (default: {DEFAULT_GRAMMAR})",
    )
    parser.add_argument(
        "prompts", nargs="+", help="The prompts to use for generation."
    )
    args = parser.parse_args()

    device = torch.device("cpu")
    print(f"Using device: {device}")

    # Load model and tokenizer
    tokenizer = AutoTokenizer.from_pretrained(args.model_id)
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(args.model_id).to(device)
    model.generation_config.pad_token_id = model.generation_config.eos_token_id

    # Load JSON grammar
    with open(args.grammar_file, "r") as file:
        grammar_str = file.read()
    grammar = IncrementalGrammarConstraint(grammar_str, "root", tokenizer)
    grammar_processor = GrammarConstrainedLogitsProcessor(grammar)

    # Generate
    input_ids = tokenizer(
        args.prompts,
        add_special_tokens=False,
        return_tensors="pt",
        padding=True,
    )["input_ids"]
    output = model.generate(
        input_ids,
        max_length=50,
        logits_processor=[grammar_processor],
        repetition_penalty=1.1,
        num_return_sequences=1,
    )

    # Decode output
    generations = tokenizer.batch_decode(output, skip_special_tokens=True)
    print(generations)


if __name__ == "__main__":
    main()
I get:
Using device: cpu
tokenizer_config.json: 100%|███████████████████| 237/237 [00:00<00:00, 4.06MB/s]
vocab.json: 100%|████████████████████████████| 798k/798k [00:00<00:00, 10.9MB/s]
merges.txt: 100%|████████████████████████████| 456k/456k [00:00<00:00, 9.77MB/s]
tokenizer.json: 100%|██████████████████████| 2.11M/2.11M [00:00<00:00, 15.0MB/s]
added_tokens.json: 100%|███████████████████| 1.08k/1.08k [00:00<00:00, 21.9MB/s]
special_tokens_map.json: 100%|███████████████| 99.0/99.0 [00:00<00:00, 1.64MB/s]
config.json: 100%|█████████████████████████████| 864/864 [00:00<00:00, 15.9MB/s]
pytorch_model.bin: 100%|███████████████████| 2.84G/2.84G [03:33<00:00, 13.3MB/s]
generation_config.json: 100%|████████████████| 74.0/74.0 [00:00<00:00, 1.07MB/s]
WARNING:transformers_cfg.vocab_struct:Warning: unrecognized tokenizer: using default token formatting
Traceback (most recent call last):
File "/home/eric/Prj/cfg_llm_security/cfg_generate.py", line 73, in <module>
main()
File "/home/eric/Prj/cfg_llm_security/cfg_generate.py", line 59, in main
output = model.generate(
^^^^^^^^^^^^^^^
File "/home/eric/venv/cfg_llm_security/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/eric/venv/cfg_llm_security/lib/python3.11/site-packages/transformers/generation/utils.py", line 1479, in generate
return self.greedy_search(
^^^^^^^^^^^^^^^^^^^
File "/home/eric/venv/cfg_llm_security/lib/python3.11/site-packages/transformers/generation/utils.py", line 2353, in greedy_search
next_tokens_scores = logits_processor(input_ids, next_token_logits)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/eric/venv/cfg_llm_security/lib/python3.11/site-packages/transformers/generation/logits_process.py", line 97, in __call__
scores = processor(input_ids, scores)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/eric/venv/cfg_llm_security/lib/python3.11/site-packages/transformers_cfg/generation/logits_process.py", line 102, in __call__
return self.process_logits(input_ids, scores)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/eric/venv/cfg_llm_security/lib/python3.11/site-packages/transformers_cfg/generation/logits_process.py", line 95, in process_logits
self.mask_logits(scores, scores.device)
File "/home/eric/venv/cfg_llm_security/lib/python3.11/site-packages/transformers_cfg/generation/logits_process.py", line 57, in mask_logits
logits[~acceptance] = -math.inf
~~~~~~^^^^^^^^^^^^^
IndexError: The shape of the mask [1, 50295] at index 1 does not match the shape of the indexed tensor [1, 51200] at index 1
Because of the following warning in the error output:
WARNING:transformers_cfg.vocab_struct:Warning: unrecognized tokenizer: using default token formatting
I suspect that the problem is that the tokenizer selected by the AutoTokenizer has changed from what your code is expecting. The same thing happens for phi-2.
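A quick way to confirm the mismatch (an illustrative check on my part, not part of cfg_generate.py) is to compare the tokenizer's vocabulary size with the model's configured vocabulary size:

```python
# Illustrative check: compare the tokenizer vocabulary with the model's
# configured vocabulary / logits width.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-1_5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

print(type(tokenizer).__name__)  # which class AutoTokenizer actually resolved to
print(len(tokenizer))            # tokenizer vocabulary size (50295 in the traceback above)
print(model.config.vocab_size)   # model vocabulary / logits width (51200 in the traceback above)
```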
I am working on a time-sensitive project right now, so I won't be helping more than just reporting the bug. Feel free to close this as "won't do;" I won't feel bad. I've already received the rest of your work for free. (And if I can't complete the project without fixing it, I might submit a pull request.)
Hello @RadixSeven
Thank you for reporting this error! I understand the cause and will post a solution and explanation in this discussion. Another user has reported the same issue with T5.
Reason
The problem stems from a deliberate design decision in Phi (and similar models such as T5): the tokenizer vocabulary size (50295) is smaller than the model embedding size (51200). The extra embedding slots leave room for adding special tokens later.
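To see why this surfaces as the IndexError above, here is a minimal illustration with dummy tensors (not the library code): the acceptance mask is built over the tokenizer vocabulary, but the logits it is applied to span the full embedding size.

```python
import torch

logits = torch.zeros(1, 51200)                       # logits over the model embedding size
acceptance = torch.ones(1, 50295, dtype=torch.bool)  # mask over the tokenizer vocabulary
logits[~acceptance] = -float("inf")                  # raises the IndexError shown in the traceback
```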
Fix
Adding model.resize_token_embeddings(len(tokenizer)) before running inference resolves the issue.
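In the script above, that amounts to one extra line right after the model is loaded:

```python
model = AutoModelForCausalLM.from_pretrained(args.model_id).to(device)
model.generation_config.pad_token_id = model.generation_config.eos_token_id
model.resize_token_embeddings(len(tokenizer))  # align the embedding size with the tokenizer vocabulary
```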
With your example, I get the following output: 'You would represent "My dog Sparky is 7 years old and weighs 21 kg" in JSON as {"name":["Dog","Sparky"],"age":[7,21],"weight":[21]}'
We will soon update our code to handle this automatically, so users won't have to manage it on their own.
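One way this could be handled inside the logits processor (a sketch of one possible approach, not the actual transformers_cfg implementation) is to pad the acceptance mask with False up to the logits width, so the reserved embedding slots stay permanently masked out:

```python
import torch

def mask_logits(logits: torch.Tensor, acceptance: torch.Tensor) -> torch.Tensor:
    # Pad the acceptance mask with False so it matches the logits width;
    # tokens beyond the tokenizer vocabulary are then never generated.
    gap = logits.shape[-1] - acceptance.shape[-1]
    if gap > 0:
        pad = torch.zeros(
            *acceptance.shape[:-1], gap, dtype=torch.bool, device=acceptance.device
        )
        acceptance = torch.cat([acceptance, pad], dim=-1)
    logits[~acceptance] = -float("inf")
    return logits
```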
More details:
There are two related discussions in the HF community: