
allow missing layers in models when loading checkpoint #380

Merged — 3 commits merged into jpata:main on Dec 16, 2024

Conversation

@kawaho (Contributor) commented Dec 13, 2024

This PR allows users to load checkpoints that are missing layers relative to the defined model, so that pre-trained weights can still be loaded when new layers are added to the model.

However, the optimizer checkpoint is not handled yet, because I imagine we will most likely freeze the pre-trained weights and only train whatever new layers are added. Of course, this feature could be extended to the optimizer checkpoint as well (with more work).
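For reference, a minimal sketch of this kind of partial checkpoint loading in PyTorch, assuming the checkpoint either is a plain state dict or stores it under a hypothetical "model_state_dict" key. This is not the PR's actual code, just an illustration of loading with `strict=False` and freezing the restored weights:

```python
import torch

def load_partial_checkpoint(model, checkpoint_path, freeze_loaded=True):
    # Load on CPU; the "model_state_dict" key is an assumption, fall back
    # to treating the whole file as a state dict.
    checkpoint = torch.load(checkpoint_path, map_location="cpu")
    state_dict = checkpoint.get("model_state_dict", checkpoint)

    # strict=False tolerates layers that exist in the model but not in the checkpoint.
    result = model.load_state_dict(state_dict, strict=False)
    print("missing keys (new layers, left at their random init):", result.missing_keys)
    print("unexpected keys (ignored):", result.unexpected_keys)

    if freeze_loaded:
        # Freeze the parameters that were restored from the checkpoint,
        # so only the newly added layers remain trainable.
        loaded = set(state_dict.keys()) - set(result.unexpected_keys)
        for name, param in model.named_parameters():
            if name in loaded:
                param.requires_grad_(False)
    return model
```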

Review thread on mlpf/model/training.py (outdated, resolved)
@jpata (Owner) commented Dec 16, 2024

Can you also check pre-commit? I think it might have detected some issues:

https://github.com/jpata/particleflow/actions/runs/12338808099/job/34440017254?pr=380

mlpf/model/training.py:920:13: F841 local variable 'e' is assigned to but never used
mlpf/model/training.py:929:33: F541 f-string is missing placeholders
mlpf/model/training.py:932:33: F541 f-string is missing placeholders

You can also run it locally: https://pre-commit.com/#install
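As an illustration of what those two flake8 codes flag (these are not the actual lines from mlpf/model/training.py, just representative patterns):

```python
# F841: a bound exception variable that is never used
try:
    value = int("not a number")
except ValueError as e:  # 'e' is assigned but never referenced -> F841
    value = 0

# Fix: drop the unused binding
try:
    value = int("not a number")
except ValueError:
    value = 0

# F541: an f-string with no placeholders
msg = f"checkpoint is missing some layers"  # -> F541
# Fix: use a plain string (or add an actual placeholder)
msg = "checkpoint is missing some layers"

print(value, msg)
```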

@kawaho (Contributor, Author) commented Dec 16, 2024

Sorry about that. Should be OK now.

@jpata (Owner) commented Dec 16, 2024

Nice, thank you! Merging.

@jpata jpata merged commit 944891d into jpata:main Dec 16, 2024
4 checks passed