Segfault when training doc2vec #2942
Comments
Thanks for the detailed report! Can you say a little more about the corpus size? If you enable logging at the INFO level, how much progress is shown before the fault? Does it always fault at the same point?
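For anyone following along, INFO-level logging can be switched on with Python's standard `logging` module before training starts. This is a minimal sketch, not the reporter's actual script: the helper name `enable_info_logging` is made up for illustration, and `force=True` assumes Python 3.8 or newer.

```python
import logging


def enable_info_logging() -> logging.Logger:
    """Configure root logging at INFO and return the "gensim" logger.

    Hypothetical helper for illustration; it is not part of gensim.
    """
    logging.basicConfig(
        format="%(asctime)s : %(levelname)s : %(message)s",
        level=logging.INFO,
        force=True,  # Python 3.8+: replace any handlers configured earlier
    )
    # gensim modules create their loggers under the "gensim" name hierarchy,
    # so they inherit the INFO level set on the root logger above.
    return logging.getLogger("gensim")


gensim_logger = enable_info_logging()
gensim_logger.info("INFO logging enabled; build the model after this point")
```

With this in place, the last progress lines printed before the crash show roughly how far training got.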
The corpus has 9,643,078 documents and 1,099,181,249 total words. I forgot to mention that I am running this on Ubuntu 20.04. Attached is the output from setting logging to INFO.
This may be the same issue as #2894 - fixed in the Essentially: instead of using a package from PyPI or Conda repos: do a git checkout; ensure your system has key Ubuntu packages like
(Also: that bug is only in the
I have successfully trained my model by installing gensim from GitHub. Thank you :)
Problem description
When attempting to train doc2vec, gensim segfaults.
Steps/code/corpus to reproduce
I run the code:
I get the output:
When run in gdb I get:
The backtrace I get is:
I can provide the corpus at request.
Versions