Wrong tokenization of "Shell" as "She, ll" #775
Comments
Ah, damn, this should have been added to the excluded tokenizer exceptions in The
Thanks! :)
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
gives the following output:
"Shell" should not be tokenized into ["She", "ll"].
I could not even find this mapping in tokenizer/specials.json.
Any way to fix this?
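
For anyone landing here later, a toy sketch of what the reply about "excluded tokenizer exceptions" seems to describe. This is not spaCy's actual code: `PRONOUNS`, `EXCLUDED`, `build_exceptions` and `tokenize` are all made-up names, and the sketch assumes the split comes from an apostrophe-less variant of "she'll" being generated as an exception, which then collides with the real word "Shell" unless it is put on an exclusion list.

```python
# Toy illustration only. All names here are hypothetical, not spaCy internals.
PRONOUNS = ["i", "he", "she", "we"]

# Real words that happen to look like apostrophe-less contractions.
EXCLUDED = {"ill", "hell", "shell", "well"}


def build_exceptions(exclude=frozenset()):
    """Map contraction spellings (lowercase) to their token pieces."""
    exceptions = {}
    for pron in PRONOUNS:
        # Form with apostrophe: "she'll" -> ["she", "'ll"]
        exceptions[pron + "'ll"] = [pron, "'ll"]
        # Form without apostrophe: "shell" -> ["she", "ll"], unless excluded
        bare = pron + "ll"
        if bare not in exclude:
            exceptions[bare] = [pron, "ll"]
    return exceptions


def tokenize(text, exceptions):
    """Whitespace-split, then apply exceptions while keeping original casing."""
    tokens = []
    for word in text.split():
        pieces = exceptions.get(word.lower())
        if pieces is None:
            tokens.append(word)
            continue
        start = 0
        for piece in pieces:
            tokens.append(word[start:start + len(piece)])
            start += len(piece)
    return tokens


print(tokenize("Shell", build_exceptions()))          # ['She', 'll']
print(tokenize("Shell", build_exceptions(EXCLUDED)))  # ['Shell']
print(tokenize("She'll", build_exceptions(EXCLUDED))) # ['She', "'ll"]
```

With the exclusion set applied, "Shell" stays whole while "She'll" is still split. In spaCy itself, a per-word override through `Tokenizer.add_special_case` may work as a stopgap, depending on the version in use.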