
Support consolidating sharded checkpoints in PyTorch >= 2.2 #19351

Closed
awaelchli opened this issue Jan 26, 2024 · 1 comment
Assignees: awaelchli
Labels: feature (Is an improvement or enhancement)


awaelchli commented Jan 26, 2024

Description & Motivation

The consolidation utility added in #19213 is not yet compatible with PyTorch 2.2.

Pitch

Add support for PyTorch 2.2 and remove the test skip added in #19289 for the test test_save_sharded_and_consolidate_and_load in both Fabric and the Trainer.

Alternatives

No response

Additional context

A known issue is that no_dist=True does not work in PyTorch 2.2:
pytorch/pytorch#115591
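
For context, consolidation relies on loading the sharded (distributed) checkpoint into a single full state dict without an initialized process group. A minimal sketch of that approach with the pre-2.2 torch.distributed.checkpoint API is below; the checkpoint path is a placeholder and the details of Lightning's actual utility may differ, but the no_dist=True call is the one reported as broken in the linked PyTorch issue.

```python
# Sketch: offline consolidation of a sharded (DCP) checkpoint into a single
# state dict, using the pre-2.2 torch.distributed.checkpoint API.
# The path is a placeholder; Lightning's utility from #19213 may differ.
import torch
from torch.distributed.checkpoint import FileSystemReader, load_state_dict
from torch.distributed.checkpoint.metadata import TensorStorageMetadata

checkpoint_dir = "path/to/sharded_checkpoint"  # folder written by a sharded save
reader = FileSystemReader(checkpoint_dir)
metadata = reader.read_metadata()

# Pre-allocate full (unsharded) tensors from the checkpoint metadata.
# Non-tensor (bytes) entries are skipped here for brevity.
state_dict = {
    key: torch.empty(item.size, dtype=item.properties.dtype)
    for key, item in metadata.state_dict_metadata.items()
    if isinstance(item, TensorStorageMetadata)
}

# no_dist=True loads the shards without an initialized process group.
# This is the call that no longer works in PyTorch 2.2 (pytorch/pytorch#115591).
load_state_dict(state_dict=state_dict, storage_reader=reader, no_dist=True)

torch.save(state_dict, "consolidated.pt")
```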

cc @Borda

awaelchli added the feature and needs triage labels and removed the needs triage label on Jan 26, 2024
awaelchli self-assigned this on Jan 26, 2024
awaelchli commented:

#19561
