mention_extractor.apply with clear=True fails if it's not the first run #424
I think this is a regression caused by #381.

More specifically, #381 removed `fonduer.utils.utils.get_dict_of_stable_id`, which retrieves contexts that were created previously. `MentionExtractorUDF.apply` checks whether a temporary context would conflict with existing contexts. See below:

fonduer/src/fonduer/candidates/mentions.py
Lines 587 to 594 in 0dc064d

Because `dict_of_stable_id` is initialized with an empty dict (i.e., `dict_of_stable_id = {}`), `MentionExtractorUDF.apply` fails to check a temporary context against existing contexts. Before #381, `dict_of_stable_id` was populated as `dict_of_stable_id = get_dict_of_stable_id(doc)`.
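
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not Fonduer's code: `Context`, `build_stable_id_index`, `resolve`, and the stable id string are hypothetical stand-ins, and the only assumption is that the conflict check is a dictionary lookup keyed by stable id.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical stand-in for a persisted context; not Fonduer's model."""

    stable_id: str


def build_stable_id_index(existing):
    """Map stable_id -> context for contexts created in an earlier run."""
    return {ctx.stable_id: ctx for ctx in existing}


def resolve(temp_stable_id, index):
    """Return (context, is_new): reuse an indexed context, else create one."""
    if temp_stable_id in index:
        return index[temp_stable_id], False
    ctx = Context(temp_stable_id)
    index[temp_stable_id] = ctx
    return ctx, True


# A context left over from a previous run (the stable id format is made up).
existing = [Context("doc1::span:0:5")]

# Empty index, as described after #381: the duplicate goes unnoticed.
_, is_new_with_empty_index = resolve("doc1::span:0:5", {})

# Populated index, as described before #381: the duplicate is caught.
_, is_new_with_populated_index = resolve(
    "doc1::span:0:5", build_stable_id_index(existing)
)
```

With an empty index the lookup can never hit, so every temporary context looks new even when an identical one already exists, which would be consistent with the failure only showing up on runs after the first.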