SpaCy NER training example from version 1.5.0 doesn't work in 1.6.0 #773
TL;DR

I made a bug fix to Thinc (explosion/thinc@09b030b, explained below) that accounts for the change in behaviour. The best fix on your side is probably not to average the weights when you're resuming training on so little data; see the discussion below.

What's going on

spaCy 1.x uses the Averaged Perceptron algorithm for all its machine learning. You can read about the algorithm in the POS tagger blog post, where you can also find a straightforward Python implementation: https://explosion.ai/blog/part-of-speech-pos-tagger-in-python

AP uses the Averaged Parameter Trick for SGD. There are two copies of the weights: the current weights, and a running average. During training, predictions are made with the current weights, and the averaged weights are updated in the background. At the end of training, we swap the current weights for the averages. This makes a huge difference for most training scenarios. However, when I wrote the code, I didn't pay much attention to the current use-case of "resuming" training in order to add another class.

I recently fixed a long-standing error in the averaged perceptron code: after loading a model, Thinc was not initialising the averages to the newly loaded weights. This saved memory, because the averages require another copy of the weights, and also additional book-keeping. The consequence of this bug was that when you updated a feature after resuming training, you wiped the weights that were previously associated with it. This is really bad: it means that as you train on new examples, you delete all the information previously associated with those features. I finally fixed this bug in this commit: explosion/thinc@09b030b

The consequence is that the correction makes the model behave differently on these small-data example cases. What's still unclear is how we should compute an average between the old weights and the new ones. The old weights were trained with about 20 passes over about 80,000 sentences of annotation, so 5 new passes over 5 examples shouldn't change the weights at all if we take an unbiased average. That seems undesirable. If you have so little data, it's probably not a good idea to average.

About NER and training more generally (making this the megathread)

See also #762, #612, #701, #665. Attn: @savvopoulos, @viksit

People are having a lot of pain with training the NER system. Some of the problems are easy to fix: the current workflow around saving and loading data is pretty bad, and it's made worse by some Python 2/3 unicode save/load bugs in the example scripts. What's hard to solve is that people seem to want to train the NER system on, like, 5 examples. The current algorithm expects more like 5,000. I realise I never wrote this anywhere, and the examples all show five examples. I guess I've been doing this stuff too long, and it's no longer obvious to me what is and isn't obvious. I think this has been the root cause of a lot of confusion.

Things will improve a little with spaCy 2.0. You might be able to get a useful model with as little as 500 or 1,000 sentences annotated with a new NER class. Maybe. We're working on ways to make all of this more efficient: we're working on making annotation projects less expensive and more consistent, and on algorithms that require fewer annotated examples. But there will always be limits.

The thing is, I think most teams should be annotating literally 10,000x as much data as they're currently trying to get away with. You should have at least 1,000 sentences just of evaluation data, data that your machine learning model never sees. Otherwise, how will you know that your system is working? By typing stuff into it manually? You wouldn't test your other code like that, would you? :)
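For readers unfamiliar with the averaged-parameter trick, here is a toy sketch in the style of the blog post linked above. It is illustrative only: the class and attribute names are made up, and this is not spaCy's or Thinc's actual code. The current weights are used for prediction, running totals are kept per parameter, and average_weights() swaps in the averages at the end of training.

from collections import defaultdict

class ToyAveragedPerceptron(object):
    # Toy averaged perceptron: current weights plus bookkeeping that lets us
    # recover the averaged weights at the end of training.

    def __init__(self):
        self.weights = defaultdict(lambda: defaultdict(float))  # current weights, used for prediction
        self._totals = defaultdict(float)   # accumulated weight mass per (feature, class)
        self._tstamps = defaultdict(int)    # update step at which each parameter last changed
        self.i = 0                          # number of updates so far

    def predict(self, features):
        scores = defaultdict(float)
        for feat in features:
            for clas, weight in self.weights[feat].items():
                scores[clas] += weight
        return max(scores, key=scores.get) if scores else None

    def update(self, truth, guess, features):
        # Standard perceptron update, but keep running totals so the averaged
        # weights can be recovered later.
        self.i += 1
        if truth == guess:
            return
        for feat in features:
            for clas, delta in ((truth, 1.0), (guess, -1.0)):
                param = (feat, clas)
                weight = self.weights[feat][clas]
                self._totals[param] += (self.i - self._tstamps[param]) * weight
                self._tstamps[param] = self.i
                self.weights[feat][clas] = weight + delta

    def average_weights(self):
        # The swap described above: replace each current weight with its
        # average over all update steps.
        for feat, clas_weights in self.weights.items():
            for clas, weight in clas_weights.items():
                param = (feat, clas)
                total = self._totals[param] + (self.i - self._tstamps[param]) * weight
                clas_weights[clas] = total / max(self.i, 1)

The two copies of the weights mentioned above correspond here to self.weights and the _totals/_tstamps bookkeeping that average_weights() uses to recover the averages.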
Are there alternative models that are more robust with respect to smaller datasets? Playing with luis.ai and wit.ai, their NERs seem to handle smaller datasets, but I'm not sure what they're using behind the scenes. Their models retrain pretty quickly, so they're likely not complex.
@etchen99: Neural network models will do better at this, because we'll be able to use transfer learning: we can import knowledge from other tasks, about the language in general. That helps a lot when you don't have much data. But, again, "not much data" here means "a few thousand sentences". I get that people want to train on a few dozen sentences. I think people shouldn't want that. Annotated data will never not be a part of this type of machine learning, no matter what algorithm you're using, because you're always going to need evaluation data. That won't ever change. And if you're making a few thousand sentences of evaluation data, you may as well make a few thousand more for training.
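To make the evaluation point above concrete, here is a minimal, hand-rolled scorer that compares predicted entity spans against held-out gold spans. The function name, the eval_data structure, and the scoring scheme are made up for illustration; the only spaCy assumption is that nlp(text).ents yields spans with start_char, end_char and label_, as in the examples in this thread.

def score_ner(nlp, eval_data):
    # eval_data: list of (text, gold_spans) pairs, where gold_spans is a set
    # of (start_char, end_char, label) tuples the model never saw in training.
    tp = fp = fn = 0
    for text, gold_spans in eval_data:
        pred_spans = set()
        for ent in nlp(text).ents:
            pred_spans.add((ent.start_char, ent.end_char, ent.label_))
        tp += len(pred_spans & gold_spans)
        fp += len(pred_spans - gold_spans)
        fn += len(gold_spans - pred_spans)
    precision = tp / float(tp + fp) if (tp + fp) else 0.0
    recall = tp / float(tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

With, say, 1,000+ held-out sentences in eval_data, as suggested above, you get precision/recall/F1 numbers instead of eyeballing individual queries.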
@honnibal Currently, the example code for training and updating the NER in the documentation only uses 2 sentences, which is obviously not enough (I realized this after reading your comment). It would be better if you put your explanation in the documentation: everyone tries to read the docs first, and only goes to the issues if they can't find what they need there. More problems with the example code:

>>> # after running the example code, it does not work
>>> nlp(u'Who is Chaka Khan?').ents
()
According to this repo, I did find a way to update the original NER model. However, it does not support training new entities. An example of training it to extract degrees:

import spacy
from spacy.gold import GoldParse

nlp = spacy.load('en')
ner = nlp.entity
text, tags = (u'B.S. in Mathematics', [(0, 4, 'DEGREE')])
doc = nlp.make_doc(text)
gold = GoldParse(doc, entities=tags)
ner.update(doc, gold)
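If you retrain this way, the discussion above suggests looping over many annotated examples for several passes rather than doing a single update. Here is a rough, hedged sketch that reuses nlp, ner and GoldParse from the snippet above; train_data and the pass count are placeholders, and the structure loosely follows the 1.x train_ner.py example linked in this thread.

import random

# train_data: a list of (text, [(start_char, end_char, label), ...]) pairs;
# per the discussion above, ideally thousands of sentences, not a handful.
for itn in range(20):              # several passes over the data
    random.shuffle(train_data)
    for text, offsets in train_data:
        doc = nlp.make_doc(text)
        nlp.tagger(doc)            # the 1.x example assigns POS tags before the NER update
        gold = GoldParse(doc, entities=offsets)
        ner.update(doc, gold)

Whether to average the weights at the end of such a run is exactly the open question discussed earlier in this thread.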
The bugs around this should now be resolved as of 1.7.3. See further discussion in #910. Usability around the retrained NER still isn't great, but the situation is improving, and this will be fully resolved once the remaining pieces are in place. All of those are underway in other threads, so I'll close this one.
I tried to use the training example here:
https://github.com/explosion/spaCy/blob/master/examples/training/train_ner.py
with spaCy 1.6.0, and the results are odd: Khan is recognized as a LOC and Berlin as a PERSON. If I go back to version 1.5.0, the result is as expected.
Could this be an issue with the off-the-shelf English model that spacy.en.download fetched for 1.6.0?
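One way to narrow down whether the library version or the downloaded model data is at fault is to pin the spaCy version explicitly and confirm what is actually loaded at runtime. A small, generic check (pkg_resources comes with setuptools; the comparison setup is only a suggestion):

# e.g. install spacy==1.5.0 and spacy==1.6.0 into separate virtualenvs,
# run spacy.en.download in each, and compare the output of train_ner.py.
import pkg_resources
print(pkg_resources.get_distribution('spacy').version)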