System requirements/performance: 45 secs for the POS example? #763
Comments
I believe the bottleneck is loading the GloVe vectors. The current code manually demarshals all 1 million 300d vectors, regardless of whether they're used. I'm working on reducing the loading overhead to be a function of the size of the vocabulary that is actually used.
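To illustrate what "loading overhead as a function of the vocabulary actually used" could look like, here is a minimal sketch. It assumes vectors stored in GloVe's plain-text layout (one word per line, followed by its space-separated components). It is not spaCy's loader, and the helper name `load_vectors_for_vocab` is hypothetical.

```python
import io


def load_vectors_for_vocab(lines, vocab):
    """Parse only the vectors whose word is in `vocab`.

    `lines` is any iterable of GloVe-style text lines: "word v1 v2 ... vn".
    Hypothetical helper for illustration; not part of spaCy.
    """
    wanted = set(vocab)
    vectors = {}
    for line in lines:
        # Split off the word first so unused rows are never converted to floats.
        word, _, rest = line.rstrip("\n").partition(" ")
        if word not in wanted:
            continue
        vectors[word] = [float(x) for x in rest.split()]
        if len(vectors) == len(wanted):
            break  # every requested word has been found
    return vectors


# Tiny in-memory stand-in for a vectors file (made-up values).
sample = io.StringIO(
    "der 0.1 0.2 0.3\n"
    "Hund 0.4 0.5 0.6\n"
    "Katze 0.7 0.8 0.9\n"
)
vectors = load_vectors_for_vocab(sample, {"Hund", "Katze"})
print(sorted(vectors))
```

The float conversion, which dominates parsing cost, then runs only for words the application needs rather than for every row in the file.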
The load time is currently a significant problem. You can make things better by setting […] The good news is that this is all overhead: once loaded, the tagger is very fast. So in real usage you'll be able to process a lot of text.
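Because the cost is paid once at load time, a long-running process can load the pipeline a single time and reuse it for every request. The sketch below shows that pattern under stated assumptions: Python 3, and `slow_load` as a hypothetical stand-in for the real model load (it is not a spaCy call, and the 0.2 second sleep is an arbitrary placeholder).

```python
import functools
import time


def slow_load(name):
    """Stand-in for an expensive model load (hypothetical; not a spaCy call)."""
    time.sleep(0.2)  # simulate the start-up cost
    return lambda text: text.split()  # trivial "tagger" for the demo


@functools.lru_cache(maxsize=None)
def get_pipeline(name):
    # The first call pays the load cost; later calls return the cached object.
    return slow_load(name)


start = time.perf_counter()
get_pipeline("de")
first = time.perf_counter() - start

start = time.perf_counter()
tokens = get_pipeline("de")("Das ist ein Satz")
second = time.perf_counter() - start

print("first call: %.3fs, second call: %.3fs" % (first, second))
```

For "just in time" tagging, this means keeping the process alive (for example as a small service) rather than starting a fresh script for each sentence.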
Disabling the parser does not change anything for me. It still takes 41 to 52 seconds for that simple sentence.
Sorry to hear that. I'm back to focusing on this issue. I hope to have something that @honnibal can use in a few days. Then it's a question of when he can get the time to integrate it. TL;DR: even if you need GloVe vectors, this shouldn't be a problem for too much longer.
Closing this – the new version supports a smaller model for faster loading!
Are there any system requirements for running spaCy, or is there something wrong with my system config? The example POS tagging script takes 45 seconds to finish on my 2-core VPS (4 GB RAM, Ubuntu 16.04, Python 2.7, spaCy 1.6.0, using the German model with a test sentence of 9 words).
Is this a general system performance issue or an issue with not using Python 3?
Are there any recommendations (CPU and memory-wise) for using spaCy for "just in time" POS tagging?