
Disable automatic checkpoint loading #368

Closed
theSoenke opened this issue Oct 14, 2019 · 8 comments · Fixed by #1017

@theSoenke

Is your feature request related to a problem? Please describe.
The last checkpoint is automatically restored whenever a checkpoint exists. This is an issue for me when the model was previously trained with different settings, or when I want to train a network from scratch.

If I want to use a checkpoint, I would prefer to explicitly pass the checkpoint that should be used.

Describe the solution you'd like
Provide a Trainer option to disable the automatic loading of checkpoints.

Describe alternatives you've considered
An alternative would be to modify the ModelCheckpoint class and add that option there, so that only that class is responsible for checkpoint creation and loading.
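
A minimal sketch of the requested behavior, assuming a hypothetical explicit-checkpoint argument on the Trainer (the name `resume_from_checkpoint` is an assumption for illustration, not something this issue confirms):

```python
from pytorch_lightning import Trainer

# Explicitly pass the checkpoint to restore; the argument name is an
# assumption for illustration.
trainer = Trainer(resume_from_checkpoint="checkpoints/my_run.ckpt")

# Train from scratch: no checkpoint argument, nothing is auto-restored.
trainer = Trainer()
```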

theSoenke added the feature (Is an improvement or enhancement) and help wanted (Open to be worked on) labels on Oct 14, 2019
@williamFalcon
Contributor

williamFalcon commented Oct 23, 2019

@theSoenke the previous checkpoint is only used if the experiment has the same version. Is that what you mean?

The reason for this is that the logger object creates an experiment for every version you run. In that folder it stores all checkpoints. When you want to continue training you set the version to the one you want and it just works. If you want to train a different model then you'd need to pass in a different version or let the logger automatically go to the next version.

@neggert this is the main reason for the linking. 99% of the time we want to restore a model attached to a specific version. Since the versions are handled by the logger, it makes sense to couple them for this purpose. However, it should not limit the flexibility if someone wants to load a different checkpoint.

case 1:
Goal: train v1 and restore v1.
If Logger(version=1), this behavior is already built in.

case 2:
Goal: train v1, train v2, load v2.
Again, Logger(version=2) solves this.

case 3:
Goal: train v1, load different weights for the same hyperparameters.
Not sure if this is supported yet (I think it is, but I haven't looked at this case in a while).
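
A minimal sketch of cases 1 and 2 under the API of the time, assuming the test-tube logger (the module path and argument names are assumptions and may differ across versions):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.logging import TestTubeLogger  # module path circa 2019; an assumption

# Cases 1 and 2: pin the version so checkpoints are written to and
# restored from the same experiment folder.
logger = TestTubeLogger(save_dir="lightning_logs", version=1)
trainer = Trainer(logger=logger)

# To train a different model instead, omit version: the logger picks the
# next free version and nothing gets auto-restored.
fresh_logger = TestTubeLogger(save_dir="lightning_logs")
```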

@neggert
Contributor

neggert commented Oct 23, 2019

Hmm, okay. I think this should work fine for test tube, even with the changes in #413. This has never worked with MLFlow, since MLFlow generates its own "version". I'll have to give some thought as to how to reproduce this behavior in MLFlow.

@williamFalcon
Contributor

@theSoenke has this been fixed on master? @neggert

Borda removed the help wanted (Open to be worked on) label on Feb 3, 2020
@Ir1d
Contributor

Ir1d commented Feb 27, 2020

Shall we add an option to disable the automatic loading? I sometimes have to re-run experiments from an old version, and the automatic loading gets in the way.

@williamFalcon
Contributor

I kind of want to remove the automatic loading...
@ethanwharris @hadim thoughts?

williamFalcon added this to the 0.6.1 milestone on Feb 27, 2020
@hadim
Contributor

hadim commented Feb 27, 2020

I agree. Loading should be done only on demand IMO.

@ethanwharris
Member

Yeah, I'm on board with this - perhaps we should still throw a warning if a checkpoint is going to be overwritten?
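
A minimal sketch of that warning, with assumed names and placement (none of this is an existing API):

```python
import os
import warnings

# Hypothetical pre-training guard: warn if a run would overwrite
# checkpoints that already exist in its folder.
def warn_if_checkpoints_exist(ckpt_dir: str) -> None:
    if not os.path.isdir(ckpt_dir):
        return
    for name in os.listdir(ckpt_dir):
        if name.endswith(".ckpt"):
            warnings.warn(
                f"Checkpoint {os.path.join(ckpt_dir, name)} already exists "
                "and may be overwritten by this run."
            )
```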

@Ir1d
Contributor

Ir1d commented Feb 27, 2020

For the time being, is there a recommended way to override this functionality?
