Lemmatizing words with "'s" #401

Closed
jessicabowden opened this issue May 31, 2016 · 1 comment
Labels
lang / en English language data and models

Comments

@jessicabowden

Hi,

I came across a small bug in spaCy's lemmatization when a sentence contains the contraction 's.

Example 1:
In [6]: [token.lemma_ for token in en_nlp("Jane's got a new car")]
Out[6]: ['jane', "'", 'get', 'a', 'new', 'car']
Here I'd expect it to extract either "Jane" or "Jane has".

In [7]: [token.lemma_ for token in en_nlp("Jane's my friend")]
Out[7]: ['jane', "'s", 'my', 'friend']
Here I'd perhaps expect "Jane is".

In [8]: [token.lemma_ for token in en_nlp("Jane thinks that's a nice car")]
Out[8]: ['jane', 'think', 'that', "'", 'a', 'nice', 'car']
And here is the same problem with a non-entity token ("that's").
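
To make the expected output concrete, here is a minimal sketch of a post-processing pass over already-tagged tokens. It is not spaCy's lemmatizer and makes no spaCy calls: the function name `fix_clitic_lemmas`, the `(text, tag, lemma)` triple format, and the hand-written Penn Treebank tags in the example are all assumptions for illustration. The rule is a rough heuristic (a verbal 's before a past participle becomes "have", otherwise "be"), so it would misread passives such as "Jane's loved by everyone".

```python
def fix_clitic_lemmas(tokens):
    """Return lemmas, expanding a verbal 's to 'have' or 'be'.

    `tokens` is a list of (text, tag, lemma) triples. The tags are assumed
    to be Penn Treebank tags supplied by the caller (hypothetical input,
    not produced by this function).
    """
    lemmas = []
    for i, (text, tag, lemma) in enumerate(tokens):
        if text.lower() != "'s":
            # Not the clitic: keep whatever lemma was supplied.
            lemmas.append(lemma)
        elif tag == "POS":
            # Possessive marker ("Jane's car"): leave it as the clitic itself.
            lemmas.append("'s")
        else:
            # Verbal 's: guess "have" before a past participle, else "be".
            next_tag = tokens[i + 1][1] if i + 1 < len(tokens) else None
            lemmas.append("have" if next_tag == "VBN" else "be")
    return lemmas


# "Jane's got a new car", with hand-assigned tags and the lemmas reported above.
example = [
    ("Jane", "NNP", "jane"),
    ("'s", "VBZ", "'"),
    ("got", "VBN", "get"),
    ("a", "DT", "a"),
    ("new", "JJ", "new"),
    ("car", "NN", "car"),
]
print(fix_clitic_lemmas(example))
```

With the tags given here, the second lemma comes out as "have" instead of "'", which matches the "Jane has" reading suggested above.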

Thanks

@ines ines added the "lang / en" (English language data and models) and "🌙 nightly" (Discussion and contributions related to nightly builds) labels Jan 8, 2017
@ines ines added this to the Update lemmatizer and morphology milestone Feb 18, 2017
ines added a commit that referenced this issue Mar 13, 2017
@ines ines closed this as completed in d0b85fa Mar 18, 2017
@ines ines removed the "🌙 nightly" (Discussion and contributions related to nightly builds) label May 7, 2017
@lock

lock bot commented May 8, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked as resolved and limited conversation to collaborators May 8, 2018