Load local tokenizer #116
Comments
if you replace |
Yeah, I'm currently using from_file and it works fine.
I will add a check then. Are you passing the path to the folder or to the JSON file directly?
The JSON file.
Also, maybe we should add an option for a BPE tokenizer in MinhashDedup?
you mean instead of |
Yeah, maybe we could support both by changing the function a bit?
Could we use |
I would really like to avoid the dependency on |
Take a look at the linked PR.
Due to some network issues, I need to download the tokenizer first and then load it from a local path. But the current tokenizer only supports identifier-based loading from HF. Is it possible to add a function that loads from a local path, like AutoTokenizer in the transformers lib?
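For anyone hitting the same thing, here is a rough sketch of what such a check could look like. It is not the project's actual implementation: `resolve_tokenizer_source` and `load_tokenizer` are hypothetical helper names, and looking for `tokenizer.json` inside a folder is an assumption based on the usual Hugging Face file layout. The only library calls used are `Tokenizer.from_file` and `Tokenizer.from_pretrained` from the `tokenizers` package.

```python
import os
from typing import Tuple


def resolve_tokenizer_source(name_or_path: str) -> Tuple[str, str]:
    """Decide whether `name_or_path` is a local tokenizer file or a Hub identifier.

    Hypothetical helper: returns ("file", path_to_json) or ("hub", identifier).
    """
    # A path to the JSON file itself is used directly.
    if os.path.isfile(name_or_path):
        return "file", name_or_path
    # Assumption: a folder holds the tokenizer under the name "tokenizer.json".
    if os.path.isdir(name_or_path):
        candidate = os.path.join(name_or_path, "tokenizer.json")
        if os.path.isfile(candidate):
            return "file", candidate
        raise FileNotFoundError(f"no tokenizer.json found in {name_or_path!r}")
    # Anything that is not an existing path is treated as a Hub identifier.
    return "hub", name_or_path


def load_tokenizer(name_or_path: str):
    """Load a tokenizer from a local file if one exists, else from the Hub."""
    # Imported lazily so the path check above works without `tokenizers` installed.
    from tokenizers import Tokenizer

    kind, source = resolve_tokenizer_source(name_or_path)
    if kind == "file":
        return Tokenizer.from_file(source)
    return Tokenizer.from_pretrained(source)
```

With something like this, passing either the JSON file or its parent folder would load locally, and any other string would still go through the identifier-based path, so existing configs would keep working.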