Python v0.2.0
In this release, we fixed some inconsistencies between the `BPETokenizer` and the original Python version of this tokenizer. If you created your own vocabulary using this tokenizer, you will need to either train a new one or use a modified version in which you set the `PreTokenizer` back to `Whitespace` (instead of `WhitespaceSplit`).
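As a rough illustration of the second option, here is a minimal sketch of switching the pre-tokenizer back to `Whitespace`. It assumes the `tokenizers.pre_tokenizers` module as it exists in recent versions of the library, and the file names `vocab.json` / `merges.txt` are placeholders for your own trained vocabulary; the exact constructors may differ in older releases.

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace

# Hypothetical paths to a vocabulary trained before this release.
tokenizer = Tokenizer(BPE.from_file("vocab.json", "merges.txt"))

# Restore the previous pre-tokenization behavior so the existing
# vocabulary keeps producing the same tokens as before.
tokenizer.pre_tokenizer = Whitespace()

print(tokenizer.encode("Hello, world!").tokens)
```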