fill-mask target for full words not enabled? #17374
Comments
Hi @i-am-neo, fill-mask works at the token level, not the word level, so you cannot use targets that span multiple tokens.
Thanks @Narsil. I had thought so. No plans to allow full words and regex on your roadmap?
It's not something that fits the current pipeline. Since it is a non-trivial problem, we decided not to do it on behalf of users, and to give an output that is much closer to what the original model does. If simple strategies can be implemented, maybe we can add them as opt-in parameters, but so far nothing is being worked on as far as I know. PRs are more than welcome. If you want more background, this PR might be valuable to read (and the linked PRs too): #10222. Side note: an easy starting point for the regex case is to fetch all tokens in the vocabulary that start with your prefix and use them as targets.
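For future readers, a rough sketch of that prefix idea (model name, prefix, and sentence are placeholders, not from this thread; it assumes a RoBERTa-style vocabulary where word-initial tokens carry a leading "Ġ"):

```python
from transformers import pipeline

# Placeholder model; adjust to your setup.
nlp = pipeline("fill-mask", model="roberta-base")

prefix = "damn"
# RoBERTa marks word-initial tokens with a leading "Ġ", so strip it before matching.
targets = [
    tok for tok in nlp.tokenizer.get_vocab()
    if tok.lstrip("Ġ").startswith(prefix)
]

sentence = f"He regards the teaching about eternal {nlp.tokenizer.mask_token} as hypothetical."
for candidate in nlp(sentence, targets=targets, top_k=5):
    print(candidate["token_str"], candidate["score"])
```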
I hear you @Narsil, it sure is non-trivial. In my case, I would like a large-enough LM (for example, Roberta-large) to generate word candidates to start with, given some regex as hints/constraints, without knowing in advance what the best candidates are beyond those hints. My thinking is that the candidates the LM generates would more or less already fit the context given to the model. Multiple candidates would then be ranked post-fill by their scores.
I think there would be a lot of value in being able to do that, but AFAIK there's no simple way to do it with BERT-like models. I think the biggest culprit is that models are trained to give independent probabilities, not joint ones. Solving it might require an entirely new training objective.
For example, with two masked positions the model might independently predict (big: 50%, red: 50%) for each mask, with no way to express the joint probability of the pair.
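A small code illustration of that independence (mine, not from the thread), assuming a transformers version whose fill-mask pipeline accepts multiple mask tokens and returns one candidate list per masked position:

```python
from transformers import pipeline

nlp = pipeline("fill-mask", model="roberta-base")
mask = nlp.tokenizer.mask_token

# Two masked positions: the pipeline scores each one on its own.
outputs = nlp(f"The {mask} {mask} car sped past us.", top_k=2)

for position, candidates in enumerate(outputs):
    for c in candidates:
        print(position, c["token_str"], round(c["score"], 3))

# Multiplying per-position scores only approximates a joint probability such as
# P("big", "red") under an independence assumption; the model was never trained
# to expose that joint distribution directly.
```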
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
Who can help?
@Narsil and @LysandreJik (?)
How can one use Roberta-large with fill-mask to get a full-word candidate and its "full" score? Open to workaround solutions.
My example:
sentence = f"Nitzsch argues against the doctrine of the annihilation of the wicked, regards the teaching of Scripture about eternal {nlp.tokenizer.mask_token} as hypothetical."
Notebook here.
Using pipeline, the output I get is:
The specified target token `damnation` does not exist in the model vocabulary. Replacing with `Ġdamn`.
Thanks.
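For reference, a minimal reproduction along the lines described above would look roughly like this (reconstructed from the description; the exact code is in the linked notebook):

```python
from transformers import pipeline

nlp = pipeline("fill-mask", model="roberta-large")

sentence = (
    "Nitzsch argues against the doctrine of the annihilation of the wicked, "
    "regards the teaching of Scripture about eternal "
    f"{nlp.tokenizer.mask_token} as hypothetical."
)

# "damnation" splits into several sub-word tokens, so the pipeline warns and
# falls back to the single token "Ġdamn", scoring that instead of the full word.
print(nlp(sentence, targets=["damnation"], top_k=3))
```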
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
See notebook above.
Expected behavior
I expect to see "damnation" with its score.