spaCy integration ignores old entities #17

Closed
davidberenstein1957 opened this issue Jun 28, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

@davidberenstein1957
Contributor

I don't think completely overwriting the previously obtained entities is best practice. Something like the code below, which merges the existing `doc.ents` with the new entities and lets `filter_spans` resolve any overlaps, might work better:

from spacy.util import filter_spans

doc.set_ents(filter_spans(list(doc.ents) + new_ents))
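For readers unfamiliar with `filter_spans`: as far as I understand it, it keeps the longest span whenever two candidates overlap and prefers the earlier one on ties. Below is a minimal, dependency-free sketch of that selection rule. It uses plain `(start, end, label)` tuples instead of spaCy `Span` objects, and `merge_spans` is a hypothetical helper name used only for illustration, not part of spaCy or SpanMarker.

```python
from typing import List, Set, Tuple

# (start_token, end_token_exclusive, label); a stand-in for a spaCy Span.
SpanTuple = Tuple[int, int, str]


def merge_spans(old: List[SpanTuple], new: List[SpanTuple]) -> List[SpanTuple]:
    """Merge two entity lists, dropping overlaps (longest span wins).

    Hypothetical helper that approximates the selection rule of
    spacy.util.filter_spans on plain tuples.
    """
    candidates = list(old) + list(new)
    # Longest spans first; among equal lengths, the earliest start first.
    candidates.sort(key=lambda s: (s[1] - s[0], -s[0]), reverse=True)

    kept: List[SpanTuple] = []
    seen_tokens: Set[int] = set()
    for start, end, label in candidates:
        tokens = range(start, end)
        # Keep the span only if none of its tokens is already claimed.
        if seen_tokens.isdisjoint(tokens):
            kept.append((start, end, label))
            seen_tokens.update(tokens)

    # Return the survivors in document order.
    return sorted(kept, key=lambda s: s[0])


old_ents = [(0, 2, "PERSON"), (5, 6, "GPE")]
new_ents = [(1, 4, "ORG"), (8, 9, "DATE")]
merged = merge_spans(old_ents, new_ents)
print(merged)
```

Here the old `PERSON` span overlaps the longer new `ORG` span and is dropped, while the non-overlapping old `GPE` entity survives instead of being overwritten.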
@tomaarsen added the enhancement label on Jun 28, 2023
@tomaarsen
Owner

I'll look into adding this as a toggle, off by default. Thanks for the suggestion!

davidberenstein1957 added a commit to davidberenstein1957/SpanMarkerNER that referenced this issue Jul 12, 2023
tomaarsen added a commit that referenced this issue Aug 24, 2023
* added `pipe()` to spaCy integration

* added spaCy `.pipe()` integration tests

* chore: avoid overwriting pre-existing entities #17

* chore: disable removing NER pipeline by default

* chore: added batch size warning

* chore: added overwrite_entities flag
chore: removed warning
chore: updated changelog

* fix: resolved small typo

* Small refactor + formatting

* Removed Optional from `overwrite_entities`
* Introduce `set_ents` method to prevent duplicate code

* Update changelog

* Update documentation with overwrite_entities

* Reintroduce accidentally removed "Fixed" header

* Prefer SpanMarker outputs over spaCy outputs

---------

Co-authored-by: Tom Aarsen <[email protected]>
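
The last bullet, "Prefer SpanMarker outputs over spaCy outputs", describes a different conflict policy from "longest span wins". The commit's actual implementation is not shown in this thread, so the following is only a sketch of one way such a policy could look, assuming the new spans do not overlap each other. It uses plain `(start, end, label)` tuples rather than spaCy `Span` objects, and `prefer_new` is a hypothetical name.

```python
from typing import List, Set, Tuple

# (start_token, end_token_exclusive, label); a stand-in for a spaCy Span.
SpanTuple = Tuple[int, int, str]


def prefer_new(old: List[SpanTuple], new: List[SpanTuple]) -> List[SpanTuple]:
    """Keep every new span; keep an old span only if it touches no new span.

    Hypothetical illustration, assuming the spans in `new` are mutually
    non-overlapping. Not taken from the SpanMarker source.
    """
    covered: Set[int] = set()
    for start, end, _ in new:
        covered.update(range(start, end))

    # Old entities survive only where the new model predicted nothing.
    survivors = [s for s in old if covered.isdisjoint(range(s[0], s[1]))]
    return sorted(survivors + list(new), key=lambda s: s[0])


spacy_ents = [(0, 3, "PERSON"), (5, 6, "GPE")]
span_marker_ents = [(1, 2, "ORG")]
result = prefer_new(spacy_ents, span_marker_ents)
print(result)
```

Under a pure "longest span wins" rule the three-token `PERSON` span would have beaten the one-token `ORG` span; with this policy the newer prediction is kept instead.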
@tomaarsen
Owner

Implemented in #16 and released in 1.3.0.
