I tried using p_tqdm for multiprocessing inside a function, and it runs extremely slowly compared with a plain pathos pool:
import spacy
from pathos.pools import ThreadPool as Pool
import time
from p_tqdm import p_map

# Install with python -m spacy download es_core_news_sm
nlp = spacy.load("es_core_news_sm")


def preworker(text, nlp):
    return [w.lemma_ for w in nlp(text)]


worker = lambda text: preworker(text, nlp)

texts = ["Este es un texto muy interesante en español"] * 1000

st = time.time()
pool = Pool(3)
r = pool.map(worker, texts)
print(f"Usual pool took {time.time()-st:.3f} seconds")


def out_worker(texts, nlp):
    worker = lambda text: preworker(text, nlp)
    pool = Pool(3)
    return pool.map(worker, texts)


st = time.time()
r = out_worker(texts, nlp)
print(f"Pool within a function took {time.time()-st:.3f} seconds")


def out_worker_tqdm(texts, nlp):
    worker = lambda text: preworker(text, nlp)
    return p_map(worker, texts)


st = time.time()
r = out_worker_tqdm(texts, nlp)
print(f"p_tqdm within a function took {time.time()-st:.3f} seconds")


def out_worker2(texts, nlp, pool):
    worker = lambda text: preworker(text, nlp)
    return pool.map(worker, texts)


st = time.time()
pool = Pool(3)
r = out_worker2(texts, nlp, pool)
print(f"Pool passed to a function took {time.time()-st:.3f} seconds")
The output is:
Usual pool took 0.052 seconds
Pool within a function took 0.062 seconds
100%|██████████| 10/10 [00:08<00:00, 1.23it/s]
p_tqdm within a function took 8.341 seconds
Pool passed to a function took 0.055 seconds
I got the tip to use a thread pool instead of the usual pool from the pathos author here. I guess p_tqdm uses the usual (process-based) pool underneath, but I haven't checked.
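For anyone who wants the progress bar without leaving threads, here is a minimal sketch of a thread-based map that wraps the results in tqdm when it is installed. It uses only `concurrent.futures` from the standard library rather than pathos or p_tqdm, so it is not how either library works internally. `thread_map_with_progress` and `fake_lemmatize` are hypothetical names; the latter is just a stand-in for `preworker(text, nlp)` so the snippet runs without spaCy.

```python
from concurrent.futures import ThreadPoolExecutor

try:
    from tqdm import tqdm  # optional; only used to draw the progress bar
except ImportError:
    tqdm = None


def fake_lemmatize(text):
    # Hypothetical stand-in for preworker(text, nlp); swap in the real call.
    return text.lower().split()


def thread_map_with_progress(fn, items, num_threads=3):
    """Apply fn to every item using threads, preserving input order."""
    items = list(items)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map yields results lazily, in the order of the inputs.
        results = pool.map(fn, items)
        if tqdm is not None:
            results = tqdm(results, total=len(items))
        # Consume the iterator while the pool is still open.
        return list(results)


texts = ["Este es un texto muy interesante en español"] * 1000
lemmas = thread_map_with_progress(fake_lemmatize, texts)
```

Because threads share the interpreter's memory, the function and anything it closes over (such as a loaded `nlp` model) do not have to be serialized and sent to worker processes. That is one plausible explanation for the gap in the timings above, but I have not verified it against the p_tqdm source.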