Difference between spacy and Stanford Parser in results #259

rbhood · 2016-02-11T06:44:08Z

I am working on sentiment analysis, for which I need the dependency relations between words in order to extract each aspect and its corresponding sentiment word. I have tried both spaCy and the Stanford parser. The relations given by Stanford are more accurate and relevant for my use case, but spaCy is much faster, so I would prefer to use only spaCy.

Below are some examples where spaCy gives me a problem:

Sentence: Alice is happy.
Stanford: It provides a direct relationship between Alice and happy, so I can use Alice as my aspect and happy as its sentiment word.
Spacy: spaCy instead gives relationships between (alice, is) and (is, happy).

Note: If the sentence is something like "Alice likes apples", then both Stanford and spaCy give the same relationships, (alice, likes) and (likes, apples). But with verbs like "is" and "are", Stanford provides a direct relationship.

Sentence: There is plenty of leg room.
Stanford: Relates plenty with room, as expected, since plenty describes the leg room.
Spacy: Does not provide any such relationship.

Sentence: In September, upon return to Toronto, my suitcase was damaged with the zipper mechanism and lock literally torn off.
Stanford: It gives relationships between (mechanism, torn) and (lock, torn).
Spacy: It doesn't provide any relationship between these words, even though we can see they are directly related.

All Stanford outputs are from the Stanford NLP parser site, http://nlp.stanford.edu:8080/parser/index.jsp, as well as from the Stanford packages, but those packages are very slow compared to spaCy.

So, is there any way to make spaCy give exactly the same parsing output as Stanford? It would be a great help.

honnibal · 2016-02-23T23:43:46Z

Hey,

This is a question of the annotation scheme. It's true that the relations spaCy is returning are a bit more low-level. We could post-process the relations to get a similar result to the Stanford ones, and for some purposes this would be better.
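
A minimal sketch of the kind of post-processing mentioned above, not spaCy's own API: it links the subject of a copula directly to its complement, which is roughly the (Alice, happy) relation from the Stanford output. It assumes the parser labels the subject `nsubj` and the complement `acomp` or `attr`, and that the copula's lemma is `be`. The names `collapse_copula` and `make_tokens` are hypothetical, and `make_tokens` only builds stand-in objects so the example runs without a model.

```python
from collections import defaultdict
from types import SimpleNamespace

SUBJECT_DEPS = {"nsubj"}
COMPLEMENT_DEPS = {"acomp", "attr"}  # assumed labels for copula complements


def collapse_copula(tokens):
    """Return (subject, complement) text pairs that skip over a copula.

    Works on any objects exposing .i, .text, .lemma_, .dep_ and .head.
    """
    # Group tokens under the index of their head; the root is its own head.
    children = defaultdict(list)
    for tok in tokens:
        if tok.head.i != tok.i:
            children[tok.head.i].append(tok)

    pairs = []
    for tok in tokens:
        if tok.lemma_ != "be":
            continue
        subjects = [c for c in children[tok.i] if c.dep_ in SUBJECT_DEPS]
        complements = [c for c in children[tok.i] if c.dep_ in COMPLEMENT_DEPS]
        pairs.extend((s.text, c.text) for s in subjects for c in complements)
    return pairs


def make_tokens(rows):
    """Hypothetical stand-in for a parsed sentence.

    Each row is (text, lemma, dep, head_index).
    """
    toks = [
        SimpleNamespace(i=i, text=text, lemma_=lemma, dep_=dep, head=None)
        for i, (text, lemma, dep, _) in enumerate(rows)
    ]
    for tok, row in zip(toks, rows):
        tok.head = toks[row[3]]
    return toks


# Hand-written parse of "Alice is happy." (an assumption, not parser output).
example = make_tokens([
    ("Alice", "alice", "nsubj", 1),
    ("is", "be", "ROOT", 1),
    ("happy", "happy", "acomp", 1),
    (".", ".", "punct", 1),
])
print(collapse_copula(example))
```

Because the function only reads `.i`, `.text`, `.lemma_`, `.dep_` and `.head`, the tokens of a parsed spaCy document should be usable in place of the stand-ins, provided the labels match the assumptions above.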

Btw, there's a whole can of worms around this sort of topic. Like, if you have "I eat plenty of apples", you probably want a relationship between "eat" and "apples", right? Instead in both our scheme and Stanford's you'll get a relationship between "eat" and "plenty".

There's really a need for a more abstract semantic representation on top of the syntactic parse. I'm not sure Stanford's solution of making the parse more semantic is what I like best. I think there's a need for the syntactic representation. It's just that currently, we don't have semantic role labelling. So it's true that the lower-level nature of spaCy's parse makes it difficult to work with in places.

For cases like 'plenty of leg room', you can improve things by merging the phrase:

>>> from spacy.en import English
>>> nlp = English()
>>> doc = nlp(u'I like plenty of leg room.')
>>> spans = []
>>> for word in doc:
...   if word.text in ('plenty', 'lots', 'heaps', 'all') and word.nbor(1).text == 'of' and len(list(word.subtree)) >= 3:
...     span = doc[word.left_edge.i : word.right_edge.i + 1]
...     spans.append(span)
... 
>>> spans
[plenty of leg room]
>>> spans[0].merge(span.root.tag_, span[2:].root.lemma_, span.root.ent_type_)
>>> for word in doc:
...   print(word.text, word.lemma_, word.dep_, word.head.text)
... 
(u'I', u'i', u'nsubj', u'like')
(u'like', u'like', u'ROOT', u'like')
(u'plenty of leg room', u'room', u'dobj', u'like')
(u'.', u'.', u'punct', u'like')

What we're doing here is retokenizing the sentence so that you can get the relationships you need. We set the "lemma" of our new token 'plenty of leg room' to be 'room', and spaCy knows how to forward all the dependencies, so that the new token is attached correctly to 'like'.

rachit221195 · 2017-08-23T16:45:05Z

@honnibal Is there any function or method in spaCy that gives results similar to those from Stanford NLP? Specifically, something that can help me achieve this:
[((u'shot', u'VBD'), u'nsubj', (u'I', u'PRP')), ((u'shot', u'VBD'), u'dobj', (u'elephant', u'NN')), ((u'elephant', u'NN'), u'det', (u'an', u'DT')), ((u'shot', u'VBD'), u'prep', (u'in', u'IN')), ((u'in', u'IN'), u'pobj', (u'sleep', u'NN')), ((u'sleep', u'NN'), u'poss', (u'my', u'PRP$'))]

The code that does this in Stanford NLP is this:

>>> from nltk.parse.stanford import StanfordDependencyParser
>>> path_to_jar = 'path_to/stanford-parser-full-2014-08-27/stanford-parser.jar'
>>> path_to_models_jar = 'path_to/stanford-parser-full-2014-08-27/stanford-parser-3.4.1-models.jar'
>>> dependency_parser = StanfordDependencyParser(path_to_jar=path_to_jar, path_to_models_jar=path_to_models_jar)
>>> result = dependency_parser.raw_parse('I shot an elephant in my sleep')
>>> dep = next(result)
>>> list(dep.triples())

I have been searching all over to get something similar but I cannot seem to find it.
I would really appreciate any help on this matter.

RushiLuhar · 2017-09-11T09:27:34Z

@rachit221195 - yes, there is. You can combine the .subtree and .head attributes of the Token object to build up a tree representation, as with the nltk method you describe above.
Iterate through each Token in your span; if it has a subtree, you can build up a relation between the token and each token in the subtree. Hope this makes sense.
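
One way this could look, as a rough sketch rather than a built-in spaCy function: walk the tokens and emit a ((head, head tag), relation, (word, tag)) triple for every token that is not the root. This version uses only `.head` and `.dep_`, which yields direct head-to-child relations like those in the nltk output. The `dependency_triples` name and the `make_tokens` stand-in are hypothetical, and the parse below is hand-written to mirror the triples quoted in the question, so the example runs without a model.

```python
from types import SimpleNamespace


def dependency_triples(tokens):
    """Build ((head, head_tag), dep, (word, tag)) triples in token order.

    Works on any objects exposing .i, .text, .tag_, .dep_ and .head.
    """
    triples = []
    for tok in tokens:
        if tok.head.i == tok.i:
            # The root is its own head, so it has no incoming relation.
            continue
        triples.append(
            ((tok.head.text, tok.head.tag_), tok.dep_, (tok.text, tok.tag_))
        )
    return triples


def make_tokens(rows):
    """Hypothetical stand-in for a parsed sentence.

    Each row is (text, tag, dep, head_index).
    """
    toks = [
        SimpleNamespace(i=i, text=text, tag_=tag, dep_=dep, head=None)
        for i, (text, tag, dep, _) in enumerate(rows)
    ]
    for tok, row in zip(toks, rows):
        tok.head = toks[row[3]]
    return toks


# Hand-written parse of "I shot an elephant in my sleep" (an assumption).
sentence = make_tokens([
    ("I", "PRP", "nsubj", 1),
    ("shot", "VBD", "ROOT", 1),
    ("an", "DT", "det", 3),
    ("elephant", "NN", "dobj", 1),
    ("in", "IN", "prep", 1),
    ("my", "PRP$", "poss", 6),
    ("sleep", "NN", "pobj", 4),
])
for triple in dependency_triples(sentence):
    print(triple)
```

The triples come out in token order rather than in the order nltk returns them, and the labels a real parse produces may differ from the hand-written ones here.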

lock · 2018-05-08T17:27:35Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

rbhood changed the title ~~Difference between spacy and Stanford Parser~~ Difference between spacy and Stanford Parser in results Feb 11, 2016

honnibal added the performance label Oct 20, 2016

ines mentioned this issue Oct 22, 2016

💫 Document workflow: Using the dependency parse / dependency parsing #555

Closed

ines closed this as completed Oct 22, 2016

lock bot locked as resolved and limited conversation to collaborators May 8, 2018
