only main process should call _save on deepspeed zero3 (huggingface#25959)

only main process should call _save when deepspeed zero3
zjjMaiMai authored and parambharat committed Sep 26, 2023
1 parent ea1e4a0 commit 1d22f73
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion src/transformers/trainer.py
@@ -2850,7 +2850,8 @@ def save_model(self, output_dir: Optional[str] = None, _internal_call: bool = Fa
                     " stage3_gather_16bit_weights_on_model_save=false. Saving the full checkpoint instead, use"
                     " zero_to_fp32.py to recover weights"
                 )
-                self._save(output_dir, state_dict={})
+                if self.args.should_save:
+                    self._save(output_dir, state_dict={})
                 # remove the dummy state_dict
                 remove_dummy_checkpoint(self.args.should_save, output_dir, [WEIGHTS_NAME, SAFE_WEIGHTS_NAME])
                 self.model_wrapped.save_checkpoint(output_dir)
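
The pattern in this hunk can be sketched without DeepSpeed. The following is a minimal simulation, not the Trainer's implementation: `save_on_rank`, `simulate`, and the file name `placeholder_weights.bin` are hypothetical names used only for illustration, and it assumes a single-node run where `should_save` is true on rank 0 only. It shows the guarded call running once while the sharded checkpoint step still runs on every rank, which is the behaviour the change appears to aim for.

```python
import os
import tempfile


def save_on_rank(rank: int, output_dir: str, should_save: bool, log: list) -> None:
    """Hypothetical stand-in for the patched branch of save_model."""
    # Only the process flagged as the saver writes the placeholder weights file.
    if should_save:
        path = os.path.join(output_dir, "placeholder_weights.bin")
        with open(path, "wb") as f:
            f.write(b"")
        log.append(("_save", rank))
    # The sharded checkpoint write is modelled as collective: every rank records it.
    log.append(("save_checkpoint", rank))


def simulate(world_size: int) -> list:
    """Run the save path once per simulated rank and return the call log."""
    log = []
    with tempfile.TemporaryDirectory() as output_dir:
        for rank in range(world_size):
            # Assumption: only rank 0 has should_save set.
            save_on_rank(rank, output_dir, should_save=(rank == 0), log=log)
    return log


calls = simulate(world_size=4)
```

In this sketch the unguarded version would log `_save` once per rank, which mirrors several processes writing the same placeholder file into `output_dir`.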