I'm getting the error below, which says Ninja is required, when running DeepSpeed, even though Ninja 1.10.2 is installed. I also tried installing Ninja 1.8 and Ninja 1.9 and followed the steps outlined here, yet I still get the same error.
Here's the full traceback:
Using /workspace/.cache/torch_extensions as PyTorch extensions root...
Traceback (most recent call last):
  File "transformers/examples/pytorch/summarization/run_summarization.py", line 651, in <module>
    main()
  File "transformers/examples/pytorch/summarization/run_summarization.py", line 573, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/workspace/.local/lib/python3.6/site-packages/transformers/trainer.py", line 1166, in train
    self, num_training_steps=max_steps, resume_from_checkpoint=resume_from_checkpoint
  File "/workspace/.local/lib/python3.6/site-packages/transformers/deepspeed.py", line 426, in deepspeed_init
    deepspeed_engine, optimizer, _, lr_scheduler = deepspeed.initialize(**kwargs)
  File "/workspace/.local/lib/python3.6/site-packages/deepspeed/__init__.py", line 129, in initialize
    config_params=config_params)
  File "/workspace/.local/lib/python3.6/site-packages/deepspeed/runtime/engine.py", line 294, in __init__
    self._configure_optimizer(optimizer, model_parameters)
  File "/workspace/.local/lib/python3.6/site-packages/deepspeed/runtime/engine.py", line 1132, in _configure_optimizer
    self.optimizer = self._configure_zero_optimizer(basic_optimizer)
  File "/workspace/.local/lib/python3.6/site-packages/deepspeed/runtime/engine.py", line 1397, in _configure_zero_optimizer
    communication_data_type=self.communication_data_type)
  File "/workspace/.local/lib/python3.6/site-packages/deepspeed/runtime/zero/stage2.py", line 131, in __init__
    util_ops = UtilsBuilder().load()
  File "/workspace/.local/lib/python3.6/site-packages/deepspeed/ops/op_builder/builder.py", line 370, in load
    return self.jit_load(verbose)
  File "/workspace/.local/lib/python3.6/site-packages/deepspeed/ops/op_builder/builder.py", line 409, in jit_load
    verbose=verbose)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 1091, in load
    keep_intermediates=keep_intermediates)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 1302, in _jit_compile
    is_standalone=is_standalone)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 1373, in _write_ninja_file_and_build_library
    verify_ninja_availability()
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 1429, in verify_ninja_availability
    raise RuntimeError("Ninja is required to load C++ extensions")
RuntimeError: Ninja is required to load C++ extensions
I'm using the following libraries:
torch: 1.8.0
transformers: 4.15.0.dev0
deepspeed: 0.5.8
ninja: 1.10.2.3
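For anyone debugging the same error, here is a minimal diagnostic sketch. It assumes the check in `torch/utils/cpp_extension.py` boils down to running `ninja --version` as a subprocess, so what matters is whether a `ninja` executable is reachable on the PATH of the process that launches training, not whether the pip package is installed. `ninja_status` is a hypothetical helper name used only for illustration; it is not part of torch or deepspeed.

```python
import shutil
import subprocess
import sys


def ninja_status():
    """Report whether a `ninja` executable is reachable on this process's PATH.

    Hypothetical helper for illustration only (not a torch or deepspeed API).
    """
    path = shutil.which("ninja")  # None if no executable named ninja is on PATH
    if path is None:
        return {"found": False, "path": None, "version": None}
    try:
        # Run the binary the same way a shell would resolve it.
        out = subprocess.check_output([path, "--version"], stderr=subprocess.STDOUT)
        version = out.decode().strip()
    except (OSError, subprocess.CalledProcessError):
        # The file exists on PATH but could not be executed successfully.
        version = None
    return {"found": True, "path": path, "version": version}


if __name__ == "__main__":
    # Print the interpreter too, since a different Python may see a different PATH.
    print("python:", sys.executable)
    print("ninja:", ninja_status())
```

If this reports `found: False` while `pip show ninja` still lists the package, one possible cause is that the executable was installed into a user-level bin directory that is not on PATH; that is a guess to verify, not a confirmed diagnosis for this setup.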
The original code I was trying to run was: