Lexemes are unhashable (v0.101.0) #371

bwegge · 2016-05-12T08:31:49Z

When I try to add Lexemes to a set or dict, it fails since Lexemes are unhashable:

cat = nlp.vocab['cat']
dog = nlp.vocab['dog']
my_animals = {cat, dog}

Traceback (most recent call last):

  File "<ipython-input-30-8ffec97fae23>", line 1, in <module>
    my_animals = {cat, dog}

TypeError: unhashable type: 'spacy.lexeme.Lexeme'

Maybe lexeme.orth (together with lexeme.lang) could be used as the hash value?

Another funny observation is that looking up the same word multiple times through nlp.vocab[word] produces Lexemes at different addresses (although comparison works thanks to the newly implemented rich comparison):

nlp.vocab['cat']
Out[17]: <spacy.lexeme.Lexeme at 0xe865401e10>

nlp.vocab['cat']
Out[18]: <spacy.lexeme.Lexeme at 0xe865401d80>

honnibal · 2016-05-12T08:41:18Z

To save memory, the Lexeme class is a wrapper around the LexemeC struct. So the Python objects are indeed created afresh each time. You can see the implementation here: https://github.com/spacy-io/spaCy/blob/master/spacy/lexeme.pyx#L31

Adding a __hash__ method is a good idea though. Will do.
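A minimal pure-Python sketch of the idea, not spaCy's actual implementation: `LexemeSketch` is a hypothetical stand-in for a wrapper object that is created afresh on each lookup. It assumes, following the suggestion earlier in the thread, that the hash is derived from an integer ID for the word (`orth`); the ID value used here is made up.

```python
class LexemeSketch:
    """Hypothetical stand-in for a thin wrapper created on every lookup."""

    def __init__(self, orth):
        self.orth = orth  # integer ID of the word (assumed for illustration)

    def __eq__(self, other):
        if not isinstance(other, LexemeSketch):
            return NotImplemented
        return self.orth == other.orth

    def __hash__(self):
        # Hash on the same field used for equality, so equal wrappers
        # hash equally even though they are distinct Python objects.
        return hash(self.orth)


first = LexemeSketch(1234)
second = LexemeSketch(1234)  # a fresh object for the same word
unique = {first, second}     # works because __hash__ is defined
```

With both methods defined on the same field, two wrappers for the same word collapse to a single set entry even though they live at different addresses.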

bwegge · 2016-05-12T08:54:39Z

Sounds reasonable, thanks for the explanation!

lylebrown · 2016-05-15T22:02:05Z

Is there a workaround for this in the meantime? I'm new to NLP and trying to follow this guide, specifically the part where it mentions word vector representations.

jr-pe · 2016-07-12T13:38:29Z

@lylebrown
Replace the curly braces ({ }) with square brackets ([ ]) in the following line:

allWords = list({w for w in parser.vocab if w.has_vector and w.orth_.islower() and w.lower_ != "nasa"})
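The bracket swap works because a list comprehension never hashes its elements, whereas a set comprehension does. A rough sketch of the difference, using a hypothetical stand-in class (`UnhashableWord`, with made-up `orth` IDs) rather than real spaCy objects; if duplicates matter, keying a dict on a hashable field is one possible way to drop them:

```python
class UnhashableWord:
    """Hypothetical stand-in: defines __eq__ but no __hash__,
    which makes instances unhashable in Python 3."""

    def __init__(self, orth, text):
        self.orth = orth  # integer ID (assumed for illustration)
        self.text = text

    def __eq__(self, other):
        return isinstance(other, UnhashableWord) and self.orth == other.orth


words = [
    UnhashableWord(1, "cat"),
    UnhashableWord(2, "dog"),
    UnhashableWord(1, "cat"),
]

# A set comprehension hashes every element, so it raises TypeError here.
try:
    {w for w in words}
    set_worked = True
except TypeError:
    set_worked = False

# A list comprehension stores the objects without hashing them.
as_list = [w for w in words]

# Optional de-duplication: key on a hashable field instead of the object.
by_orth = {w.orth: w for w in words}
unique_words = list(by_orth.values())
```

Note that the list keeps duplicates, which the original set would have removed.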

syllog1sm · 2016-07-12T14:24:11Z

Btw the line should probably be:

allWords = [w for w in parser.vocab if w.has_vector and w.is_lower and w.lower_ != "nasa"]

The old .repvec property is now named .vector, too.

The __hash__ method will be there in the next release.

lock · 2018-05-09T11:12:03Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

honnibal added this to the Version 1.0 Release milestone Sep 24, 2016

honnibal added a commit that referenced this issue Sep 27, 2016

Fix Issue #371: Lexeme objects were unhashable.

e233328

honnibal closed this as completed Sep 27, 2016

lock bot locked as resolved and limited conversation to collaborators May 9, 2018
