Reproduction
In [1]: import transformers
In [2]: t0tt = transformers.AutoTokenizer.from_pretrained('bigscience/T0pp')
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
In [3]: t0tt.add_special_tokens({'bos_token': '[NEWSPECIAL]'})
Out[3]: 1
In [4]: t0tt.save_pretrained('saved-tokenizer')
Out[4]:
('saved-tokenizer/tokenizer_config.json',
 'saved-tokenizer/special_tokens_map.json',
 'saved-tokenizer/spiece.model',
 'saved-tokenizer/added_tokens.json',
 'saved-tokenizer/tokenizer.json')
In [5]: loaded_t0tt = transformers.AutoTokenizer.from_pretrained('saved-tokenizer')
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
In [6]: t0tt.bos_token
Out[6]: '[NEWSPECIAL]'
In [7]: loaded_t0tt.bos_token
Using bos_token, but it is not set yet.
Expected behavior
An added special token (the bos_token in the reproduction, and likewise a pad_token) should persist when the tokenizer is saved and then reloaded.
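The expectation above can be turned into a small round-trip check. This is only a sketch, not part of the original report: the helper name `special_tokens_roundtrip_diff` and its `load_fn` parameter are made up for illustration. It assumes the tokenizer object exposes `save_pretrained(directory)` and a `special_tokens_map` attribute, as `transformers` tokenizers do.

```python
import tempfile


def special_tokens_roundtrip_diff(tokenizer, load_fn):
    """Save `tokenizer`, reload it with `load_fn`, and report changed special tokens.

    Hypothetical helper: `load_fn` is any callable that takes a directory path
    and returns a tokenizer, e.g. `transformers.AutoTokenizer.from_pretrained`.

    Returns {attribute: (value_before, value_after)} for every mismatch;
    an empty dict means every special token survived the round trip.
    """
    before = dict(tokenizer.special_tokens_map)
    with tempfile.TemporaryDirectory() as tmp_dir:
        tokenizer.save_pretrained(tmp_dir)
        reloaded = load_fn(tmp_dir)
        after = dict(reloaded.special_tokens_map)
    # Compare over the union of keys so a dropped token shows up as None.
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(set(before) | set(after))
        if before.get(name) != after.get(name)
    }
```

With the tokenizer from the reproduction, calling `special_tokens_roundtrip_diff(t0tt, transformers.AutoTokenizer.from_pretrained)` should return an empty dict if the expected behavior holds. On the affected version it would presumably report a `bos_token` entry instead.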
Thanks, this will be fixed by #26570.
System Info
transformers version: 4.34.0
Who can help?
@ArthurZucker