I've tried this basic code on both Linux and Windows. I'm trying to do some online training, and after a couple of passes it throws a segfault.
Code to reproduce the problem:
from gensim.models.doc2vec import Doc2Vec, LabeledSentence, TaggedDocument
sentences = [('food', 'I like to eat broccoli and bananas.'),
             ('food', 'I ate a banana and spinach smoothie for breakfast.'),
             ('animals', 'Chinchillas and kittens are cute.'),
             ('animals', 'My sister adopted a kitten yesterday.'),
             ('animals', 'Look at this cute hamster munching on a piece of broccoli.')]
convSentences = []
for s in sentences:
    convSentences.append(LabeledSentence(tags=[s[0]], words=s[1].split()))
model = Doc2Vec(size=300, window=8, min_count=1, workers=1)
print("Pass 1:")
model.build_vocab([convSentences[0]])
model.train([convSentences[0]], total_examples=model.corpus_count)
print("Pass 2:")
model.build_vocab([convSentences[1]], update=True)
model.train([convSentences[1]], total_examples=model.corpus_count)
print("Pass 3:")
model.build_vocab([convSentences[2]], update=True)
model.train([convSentences[2]], total_examples=model.corpus_count)
print("Pass 4:")
model.build_vocab([convSentences[3]], update=True)
model.train([convSentences[3]], total_examples=model.corpus_count)
print("Pass 5:")
model.build_vocab([convSentences[4]], update=True)
model.train([convSentences[4]], total_examples=model.corpus_count)
Here's the output when running in IDLE on Windows, Python 3.5.2:
Warning (from warnings module):
  File "C:\Python35\lib\site-packages\gensim\utils.py", line 855
    warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
UserWarning: detected Windows; aliasing chunkize to chunkize_serial
Pass 1:
Pass 2:
Pass 3:
Passes 1-3 complete quickly; then, after a long pause, Linux throws a segmentation fault and Windows throws an unspecified error.
Duplicate of #1019 – but this is a very useful minimal triggering case, thank you! I'll be closing this as a duplicate, for further discussion to occur there.
FYI, the build_vocab(..., update=True) vocabulary-expansion feature was only developed and tested with respect to Word2Vec, hence this sort of bug when it is used via inheritance in Doc2Vec.
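Since the update=True path is the untested part, one possible way to sidestep it is to build the vocabulary a single time over every document and skip the later build_vocab(..., update=True) calls. The sketch below is not a fix for the bug and has not been verified against the affected release. It assumes the whole corpus is available up front, so it gives up true online vocabulary growth. The names TaggedDoc, to_tagged, and train_with_fixed_vocab are hypothetical helpers introduced here. The gensim calls mirror the ones in the report (including size=300); newer gensim releases renamed that argument to vector_size and may need other changes.

```python
from collections import namedtuple

# Stand-in with the same two fields as gensim's TaggedDocument (words, tags),
# so the corpus-preparation step can run without gensim installed.
TaggedDoc = namedtuple("TaggedDoc", ["words", "tags"])

sentences = [
    ("food", "I like to eat broccoli and bananas."),
    ("food", "I ate a banana and spinach smoothie for breakfast."),
    ("animals", "Chinchillas and kittens are cute."),
    ("animals", "My sister adopted a kitten yesterday."),
    ("animals", "Look at this cute hamster munching on a piece of broccoli."),
]


def to_tagged(pairs, doc_cls=TaggedDoc):
    """Convert (tag, text) pairs into tagged documents with whitespace tokens."""
    return [doc_cls(words=text.split(), tags=[tag]) for tag, text in pairs]


def train_with_fixed_vocab(pairs):
    """Hypothetical workaround: one build_vocab call, no update=True.

    Assumes gensim is installed and accepts the same arguments as in the
    report above; adjust for the gensim version actually in use.
    """
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    docs = to_tagged(pairs, doc_cls=TaggedDocument)
    model = Doc2Vec(size=300, window=8, min_count=1, workers=1)
    model.build_vocab(docs)  # single vocabulary scan over all documents
    model.train(docs, total_examples=model.corpus_count)
    return model


# Corpus preparation alone needs nothing beyond the standard library.
docs = to_tagged(sentences)
```

Calling train_with_fixed_vocab(sentences) would then train on all five documents with a vocabulary that never changes after the first scan. Whether that is acceptable depends on how much new vocabulary the online batches are expected to bring.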