TypeError: can't pickle Environment objects on Windows/MacOs #14

fortepianissimo opened this issue Sep 26, 2018 · 14 comments


fortepianissimo commented Sep 26, 2018

I'm running Windows 10 and following the instructions in the README. When trying to retrain the model using this command

python --dataset-type conll2003 train_eval

I ran into the following exception (right after compiling embeddings) - any tips?

Thank you for the wonderful work!

Compiling embeddings... (this is done only one time per embeddings at first launch)
path: d:\Projects\embeddings\glove.840B.300d.txt
100%|████████████████████████████████████████████████████████████████████| 2196017/2196017 [08:06<00:00, 4517.80it/s]
embeddings loaded for 2196006 words and 300 dimensions
Layer (type)                    Output Shape         Param #     Connected to
char_input (InputLayer)         (None, None, 30)     0
time_distributed_1 (TimeDistrib (None, None, 30, 25) 2150        char_input[0][0]
word_input (InputLayer)         (None, None, 300)    0
time_distributed_2 (TimeDistrib (None, None, 50)     10200       time_distributed_1[0][0]
concatenate_1 (Concatenate)     (None, None, 350)    0           word_input[0][0]
dropout_1 (Dropout)             (None, None, 350)    0           concatenate_1[0][0]
bidirectional_2 (Bidirectional) (None, None, 200)    360800      dropout_1[0][0]
dropout_2 (Dropout)             (None, None, 200)    0           bidirectional_2[0][0]
dense_1 (Dense)                 (None, None, 100)    20100       dropout_2[0][0]
dense_2 (Dense)                 (None, None, 10)     1010        dense_1[0][0]
chain_crf_1 (ChainCRF)          (None, None, 10)     120         dense_2[0][0]
Total params: 394,380
Trainable params: 394,380
Non-trainable params: 0
Epoch 1/60
Exception in thread Thread-2:
Traceback (most recent call last):
  File "d:\Anaconda3\Lib\", line 916, in _bootstrap_inner
  File "d:\Anaconda3\Lib\", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "d:\Projects\delft\env\lib\site-packages\keras\utils\", line 548, in _run
    with closing(self.executor_fn(_SHARED_SEQUENCES)) as executor:
  File "d:\Projects\delft\env\lib\site-packages\keras\utils\", line 522, in <lambda>
  File "d:\Anaconda3\Lib\multiprocessing\", line 119, in Pool
  File "d:\Anaconda3\Lib\multiprocessing\", line 174, in __init__
  File "d:\Anaconda3\Lib\multiprocessing\", line 239, in _repopulate_pool
  File "d:\Anaconda3\Lib\multiprocessing\", line 105, in start
    self._popen = self._Popen(self)
  File "d:\Anaconda3\Lib\multiprocessing\", line 322, in _Popen
    return Popen(process_obj)
  File "d:\Anaconda3\Lib\multiprocessing\", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "d:\Anaconda3\Lib\multiprocessing\", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle Environment objects
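
A note on the traceback above: it ends in ForkingPickler, so the worker process is being sent a pickled copy of its arguments, and something reachable from them (an object of type Environment) cannot be pickled. The sketch below is not DeLFT code. It uses a threading.Lock as a stand-in for the unpicklable handle, and the class name Embeddings and its fields are hypothetical. It shows one common pattern for this situation: drop the handle in __getstate__ and recreate it in __setstate__.

```python
import pickle
import threading


class Embeddings:
    """Hypothetical object that holds an open, unpicklable handle."""

    def __init__(self, path):
        self.path = path
        # A lock is used only because, like the Environment object in the
        # traceback, it cannot be pickled. It is an illustrative stand-in.
        self.env = threading.Lock()

    def __getstate__(self):
        # Copy the instance dict and drop the unpicklable handle.
        state = self.__dict__.copy()
        state["env"] = None
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # Recreate the handle on the receiving side.
        self.env = threading.Lock()


def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False on TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False
```

Whether this pattern fits DeLFT's embeddings class is an assumption; it only illustrates why the raw handle fails and how the state can be made picklable.
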

fortepianissimo commented Sep 26, 2018

Okay - disabling lmdb in embedding-registry.json seems to make that exception go away. BUT now there's another exception:

Epoch 1/60
d:\Projects\delft\env\lib\site-packages\h5py\ FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
d:\Projects\delft\env\lib\site-packages\gensim\ UserWarning: detected Windows; aliasing chunkize to chunkize_serial
  warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "d:\Anaconda3\Lib\multiprocessing\", line 105, in spawn_main
    exitcode = _main(fd)
  File "d:\Anaconda3\Lib\multiprocessing\", line 115, in _main
    self = reduction.pickle.load(from_parent)
  File "d:\Projects\delft\utilities\", line 78, in __getattr__
    return getattr(self.model, name)
  File "d:\Projects\delft\utilities\", line 78, in __getattr__
    return getattr(self.model, name)
  File "d:\Projects\delft\utilities\", line 78, in __getattr__
    return getattr(self.model, name)
  [Previous line repeated 328 more times]

pjox commented Sep 26, 2018

Hello! I haven't been able to reproduce the exception on Linux, so it might be Windows-related. I'm trying to get a Windows machine in order to try again. In the meantime, can you tell us a little more about your setup? For instance, are you using a GPU? Did you use the requirements-gpu.txt file to set it up? Also, which version of Python are you using?


Hi, sorry, I wasn't very clear about my specs:

  • Windows: Windows 10
  • GPU: Tesla Quadro P4000; yes I did install requirements-gpu.txt
  • Python: 3.6.6 (via Anaconda).

By the way, I also solved another error along the way: a "DLL load failed" message when scikit-learn is imported.

The solution is to install numpy‑1.14.6+mkl‑cp36‑cp36m‑win_amd64.whl (depending on the arch and Python version) from

pjox commented Sep 26, 2018

OK, we had some problems with Python 3.6 before. I honestly don't think the Python version is the problem, but if you have the time, can you try creating a Python 3.5 environment with conda (conda create -n myenv python=3.5) and see if you encounter the same problems? As soon as I get to try DeLFT on Windows I'll get back to you.

OK, I set up a Python 3.5 environment (version 3.5.6 via Anaconda) and created another environment, env_python35, under the delft directory. Here are the errors (infinite recursion):

Epoch 1/60
D:\Projects\delft\env_python35\lib\site-packages\h5py\ FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
D:\Projects\delft\env_python35\lib\site-packages\gensim\ UserWarning: detected Windows; aliasing chunkize to chunkize_serial
  warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\", line 106, in spawn_main
    exitcode = _main(fd)
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\", line 116, in _main
    self = pickle.load(from_parent)
  File "D:\Projects\delft\utilities\", line 78, in __getattr__
    return getattr(self.model, name)
  File "D:\Projects\delft\utilities\", line 78, in __getattr__
    return getattr(self.model, name)
  File "D:\Projects\delft\utilities\", line 78, in __getattr__
    return getattr(self.model, name)
  File "D:\Projects\delft\utilities\", line 78, in __getattr__
... (more same lines like the above) ...
RecursionError: maximum recursion depth exceeded while calling a Python object
Exception in thread Thread-1:
Traceback (most recent call last):
  File "d:\Anaconda3\envs\python35_env\Lib\", line 914, in _bootstrap_inner
  File "d:\Anaconda3\envs\python35_env\Lib\", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "D:\Projects\delft\env_python35\lib\site-packages\keras\utils\", line 548, in _run
    with closing(self.executor_fn(_SHARED_SEQUENCES)) as executor:
  File "D:\Projects\delft\env_python35\lib\site-packages\keras\utils\", line 522, in <lambda>
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\", line 118, in Pool
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\", line 174, in __init__
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\", line 239, in _repopulate_pool
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\", line 105, in start
    self._popen = self._Popen(self)
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\", line 313, in _Popen
    return Popen(process_obj)
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\", line 66, in __init__
    reduction.dump(process_obj, to_child)
  File "d:\Anaconda3\envs\python35_env\Lib\multiprocessing\", line 59, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe

pjox commented Sep 27, 2018

Thanks for the info! I have been looking around, and apparently the multiprocessing library works differently on Windows, so the series of errors you are encountering might be caused by that. However, I haven't been able to find a Windows machine to test it on yet. As soon as I can get hold of one, I'll get back to you.
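
For context, the difference comes from the process start method. Windows only supports "spawn", which starts a fresh interpreter and sends it a pickled copy of the process object. Linux has traditionally defaulted to "fork", where the child inherits the parent's memory and nothing needs to be pickled. macOS has defaulted to "spawn" since Python 3.8. A minimal sketch for checking which method a given machine uses (the helper name needs_pickling is made up for illustration):

```python
import multiprocessing as mp
import sys


def needs_pickling(method):
    """Return True if the start method sends pickled objects to the child."""
    # Under "fork" the child inherits the parent's memory directly.
    # Under "spawn" and "forkserver" the process object is pickled.
    return method in ("spawn", "forkserver")


if __name__ == "__main__":
    method = mp.get_start_method()
    print(sys.platform, method, needs_pickling(method))
```

If this reports that pickling is needed, every object handed to a worker process has to survive a pickle round trip, which is consistent with the tracebacks in this thread.
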

pjox commented Oct 18, 2018

@fortepianissimo I finally got hold of a Windows machine and was able to reproduce the error. Could you please comment out lines 77 and 78 in the file utilities/, that is, these lines:

def __getattr__(self, name):
    return getattr(self.model, name)

and try again?

Note 1: Please also disable lmdb in embedding-registry.json
Note 2: This is a workaround rather than a fix; I'll work on a definitive fix in the future

Also, please let me know if the workaround works!
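
Some background on why those two lines can recurse: when an object is unpickled, the instance is created without calling __init__, and pickle then probes it for __setstate__. With a delegating __getattr__, that probe reaches self.model, which is not set yet, so the lookup of model re-enters __getattr__ indefinitely. The sketch below reproduces the probe without DeLFT; the class names are hypothetical, and the guarded variant is one possible alternative to commenting the method out, not the project's actual fix.

```python
class RecursiveWrapper:
    """Delegates unknown attributes to self.model, as in the lines above."""

    def __init__(self, model):
        self.model = model

    def __getattr__(self, name):
        # Runs only when normal lookup fails. On an instance created
        # without __init__, self.model fails too and re-enters here.
        return getattr(self.model, name)


class GuardedWrapper:
    """Same delegation, but safe on a partially built instance."""

    def __init__(self, model):
        self.model = model

    def __getattr__(self, name):
        # Refuse dunder probes and the wrapped attribute itself, so the
        # caller sees AttributeError instead of unbounded recursion.
        if name == "model" or name.startswith("__"):
            raise AttributeError(name)
        return getattr(self.model, name)


def survives_unpickling_probe(cls):
    """Mimic the attribute probe pickle performs on a fresh instance."""
    blank = cls.__new__(cls)  # like unpickling: __init__ is not called
    try:
        getattr(blank, "__setstate__", None)
        return True
    except RecursionError:
        return False
```

The guarded version keeps delegation for ordinary attributes, so code that relies on the wrapper forwarding calls to the model would keep working under this assumption.
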

ghost commented Apr 21, 2019

Hello, I'm new to this. My specs are:

  • OS: Windows 10 Pro 64-bit
  • GPU: NVIDIA 1050Ti-mobile (4 GB) [I've already installed tensorflow-gpu as mentioned in requirement.txt]
  • Python 3.7

And I want to ask for 2 things:

  • The first is: How to disable lmdb in embedding-registry.json?
  • The second is: I've already commented out lines 77 & 78 in utilities/ but I encountered this problem (NOTE: I even tried to use pickle version 4 but nothing changed):
Using TensorFlow backend.
D:\Anaconda3\envs\ULR\lib\site-packages\gensim\ UserWarning: detected Windows; aliasing chunkize to chunkize_serial
  warnings.warn("detected Windows; aliasing chunkize to chunkize_serial")
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\Anaconda3\envs\ULR\lib\multiprocessing\", line 105, in spawn_main
    exitcode = _main(fd)
  File "D:\Anaconda3\envs\ULR\lib\multiprocessing\", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

Edited: For the first question, I've found the answer (set the "embedding-lmdb-path" to "None")
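
For anyone who prefers to script that change, here is a minimal sketch. It assumes "embedding-lmdb-path" is a top-level key of the JSON file, which is not confirmed in this thread, and the function name disable_lmdb is made up. The demo runs against a throwaway file with invented contents rather than the real registry.

```python
import json
import tempfile
from pathlib import Path


def disable_lmdb(registry_path):
    """Set "embedding-lmdb-path" to the string "None" and save the file."""
    path = Path(registry_path)
    registry = json.loads(path.read_text(encoding="utf-8"))
    # Assumption: the key sits at the top level of the registry.
    registry["embedding-lmdb-path"] = "None"
    path.write_text(json.dumps(registry, indent=4), encoding="utf-8")
    return registry


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "embedding-registry.json"
        demo.write_text(
            json.dumps({"embedding-lmdb-path": "data/db"}), encoding="utf-8"
        )
        print(disable_lmdb(demo)["embedding-lmdb-path"])
```
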

I face the same issue as @Protossnam: EOFError: Ran out of input.
I'm on Windows 10 with Python 3.5. Any updates on this?

ghost commented May 23, 2019

@davidlenz Sadly, I had to boot my laptop into Linux (Ubuntu) to run the tool. On Linux, I didn't face that issue. It may be a Windows-specific problem, and I'm also looking forward to an update on this.

Hi all,
An easy workaround would be to disable multiprocessing when running on Windows.
To do that, you need to pass multiprocessing=False each time a new Sequence object is created in

My 2 cts
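
A rough sketch of that idea, written so that it does not depend on DeLFT or Keras being installed. The helper name generator_kwargs is hypothetical. The returned keys, workers and use_multiprocessing, are the argument names Keras 2.x accepts in fit_generator; how DeLFT's own multiprocessing flag maps onto them is an assumption here.

```python
import multiprocessing as mp


def generator_kwargs(nb_workers, start_method=None):
    """Choose data-generator settings that avoid pickling where needed."""
    method = start_method or mp.get_start_method()
    if method == "fork":
        # Forked workers inherit the generator, so nothing is pickled.
        return {"workers": nb_workers, "use_multiprocessing": nb_workers > 1}
    # "spawn" and "forkserver" would pickle the generator and everything
    # it references, so run it in the main thread instead.
    return {"workers": 0, "use_multiprocessing": False}
```

The result could then be unpacked into the training call, for example model.fit_generator(generator, **generator_kwargs(6)), at the cost of slower data loading on Windows.
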


lfoppiano commented Mar 16, 2022

I have this issue when the download fails and the database is not correctly initialised, I suppose:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/", line 932, in _bootstrap_inner
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/site-packages/keras/utils/", line 744, in _run
    with closing(self.executor_fn(_SHARED_SEQUENCES)) as executor:
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/site-packages/keras/utils/", line 721, in pool_fn
    pool = get_pool_class(True)(
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/", line 212, in __init__
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/", line 303, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/", line 326, in _repopulate_pool_static
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/", line 121, in start
    self._popen = self._Popen(self)
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/", line 284, in _Popen
    return Popen(process_obj)
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/", line 32, in __init__
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/", line 19, in __init__
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Users/lfoppiano/opt/anaconda3/envs/delft2/lib/python3.8/multiprocessing/", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'Environment' object

Update: I tried to run it again and the database was correctly created (via a local version of glove); however, the problem still occurs, probably due to multithreading...

To reproduce it I used:

python -m delft.applications.citationClassifier train_eval

Update: I'm having this problem with macOS.

@kermitt2 kermitt2 changed the title TypeError: can't pickle Environment objects TypeError: can't pickle Environment objects on Windows Mar 16, 2022

lfoppiano commented Mar 28, 2022

I have the same problem with macOS.

The solution is to disable multithreading by setting nb_workers = 0. Depending on the task to be performed, it should be modified in both sequenceLabelling/ and 172.

@lfoppiano lfoppiano changed the title TypeError: can't pickle Environment objects on Windows TypeError: can't pickle Environment objects on Windows/MacOs May 11, 2022
