allow missing layers in models when loading checkpoint #380

kawaho · 2024-12-13T13:59:13Z

This PR allows loading a checkpoint that lacks some of the layers defined in the current model, so pre-trained weights can be reused after new layers are added to the model.

The optimizer checkpoint is not handled yet, because I expect that in most cases the pre-trained weights will be frozen and only the newly added layers will be trained. This feature could be extended to the optimizer checkpoint as well, with more work.
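
To illustrate the idea (not the code in this PR), here is a minimal sketch of a partial load that works on plain name-to-value mappings. The names `KeyReport`, `compare_keys`, and `merge_state` are hypothetical. In PyTorch, the two mappings would typically come from `model.state_dict()` and the loaded checkpoint, and `model.load_state_dict(state, strict=False)` reports the same kind of missing and unexpected keys.

```python
from typing import Any, Iterable, List, Mapping, NamedTuple, Tuple


class KeyReport(NamedTuple):
    """Hypothetical summary of how checkpoint keys line up with model keys."""
    loadable: List[str]    # present in both the model and the checkpoint
    missing: List[str]     # defined in the model but absent from the checkpoint
    unexpected: List[str]  # stored in the checkpoint but not defined in the model


def compare_keys(model_keys: Iterable[str], checkpoint_keys: Iterable[str]) -> KeyReport:
    """Partition parameter names into loadable, missing, and unexpected."""
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    return KeyReport(
        loadable=sorted(model_set & ckpt_set),
        missing=sorted(model_set - ckpt_set),
        unexpected=sorted(ckpt_set - model_set),
    )


def merge_state(
    model_state: Mapping[str, Any], checkpoint_state: Mapping[str, Any]
) -> Tuple[dict, KeyReport]:
    """Copy overlapping entries from the checkpoint; keep model values elsewhere."""
    report = compare_keys(model_state, checkpoint_state)
    merged = dict(model_state)  # new layers keep their freshly initialized values
    for name in report.loadable:
        merged[name] = checkpoint_state[name]
    return merged, report


if __name__ == "__main__":
    # Toy values stand in for tensors; "new_head.weight" plays the added layer.
    model_state = {"encoder.weight": 0.0, "encoder.bias": 0.0, "new_head.weight": 0.0}
    checkpoint_state = {"encoder.weight": 1.5, "encoder.bias": -0.5}
    merged, report = merge_state(model_state, checkpoint_state)
    print("missing from checkpoint:", report.missing)
    print("merged state:", merged)
```

The `loadable` list is also a natural candidate for the set of parameters to freeze when only the new layers are trained.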


jpata · 2024-12-16T06:23:48Z

Can you also check pre-commit? I think it might have detected some issues:

https://github.com/jpata/particleflow/actions/runs/12338808099/job/34440017254?pr=380

mlpf/model/training.py:920:13: F841 local variable 'e' is assigned to but never used
mlpf/model/training.py:929:33: F541 f-string is missing placeholders
mlpf/model/training.py:932:33: F541 f-string is missing placeholders
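
For reference, a small sketch of what these two flake8 codes usually point at and how they are typically resolved. The functions `parse_epoch` and `describe_missing` are hypothetical and are not taken from `training.py`.

```python
import logging

_logger = logging.getLogger(__name__)


def parse_epoch(text: str) -> int:
    """Parse an epoch number, falling back to 0 on malformed input."""
    try:
        return int(text)
    except ValueError:
        # Writing "except ValueError as e:" without using e would trigger F841.
        # The message below has no placeholders, so it is a plain string;
        # prefixing it with f would trigger F541.
        _logger.warning("could not parse epoch, defaulting to 0")
        return 0


def describe_missing(n_missing: int) -> str:
    """An f-string is appropriate here because it has a placeholder."""
    return f"{n_missing} layer(s) missing from checkpoint"
```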

You can also run it locally: https://pre-commit.com/#install

kawaho · 2024-12-16T09:40:23Z

Sorry about that. It should be OK now.

jpata · 2024-12-16T10:11:40Z

Nice, thank you! Merging.

allow missing layers in chkpt

ea20120

jpata reviewed Dec 14, 2024


remove user prompt

bbb2ed2

remove redundant code

139e01e

jpata merged commit 944891d into jpata:main Dec 16, 2024
4 checks passed
