
TF model cards #14720

Merged (18 commits) on Dec 15, 2021

Conversation

Rocketknight1
Member

  • Creating model cards for models trained with Keras!
  • Model cards will automatically be created by the PushToHubCallback - the callback will peek at the metrics and accumulate information during the training run to enable this.
  • Calling model.push_to_hub() will not, by default, create a model card. This method is cross-platform (it comes from the PushToHubMixin), and changing its behaviour would negatively affect Trainer users.
  • Instead, Keras models now have a create_model_card() method if users want to create a model card outside of the PushToHubCallback. Because this method can't peek at the training history like the callback can, you need to pass the History object returned by model.fit() to create the card.
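The metric-accumulation behaviour described above can be sketched with a minimal, framework-free stand-in. This is illustrative only — the class name, hook signatures, and card format below are simplified stand-ins, not the actual `PushToHubCallback` implementation:

```python
class ModelCardCallback:
    """Minimal Keras-style callback: accumulates per-epoch metrics during
    training and renders a tiny model card at the end."""

    def __init__(self, model_name):
        self.model_name = model_name
        self.history = []  # one dict of metrics per epoch

    def on_epoch_end(self, epoch, logs):
        # Keras calls this hook after each epoch with a dict of metrics,
        # e.g. {"loss": 0.5}; we just accumulate them.
        self.history.append({"epoch": epoch, **logs})

    def on_train_end(self):
        # Render the accumulated history as a small markdown card.
        lines = [f"# {self.model_name}", "", "## Training results", ""]
        for entry in self.history:
            metrics = ", ".join(
                f"{key}: {value:.4f}"
                for key, value in entry.items()
                if key != "epoch"
            )
            lines.append(f"- Epoch {entry['epoch']}: {metrics}")
        return "\n".join(lines)


callback = ModelCardCallback("my-test-model")
callback.on_epoch_end(0, {"loss": 0.9012})
callback.on_epoch_end(1, {"loss": 0.4567})
card = callback.on_train_end()
```

This is why the real callback can build a card automatically while `create_model_card()` cannot: only code that observes every epoch sees the full metric history, so the standalone method has to be handed the `History` object instead.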

Collaborator

@sgugger sgugger left a comment


Thanks for adding this! Let's avoid storing a token that users may then share by mistake if we can :-)

Resolved review threads:
  • src/transformers/keras_callbacks.py (×2)
  • src/transformers/modelcard.py
Comment on lines 74 to 75
# TODO Is it okay to store the hub token as a callback attribute like this?
self.hub_token = hub_token
Collaborator


If you store it, make sure it doesn't appear in the repr (we had that surprise in the TrainingArguments) or something saved automatically.

Member Author

@Rocketknight1 Rocketknight1 Dec 13, 2021


Is there any way to make this work without storing the hub_token? I could move the repo creation to __init__, which would avoid the need to store the token as a property, but that might surprise users, since the repo would then be created when the callback is constructed rather than when training begins.

Collaborator


I think it's better to do it at init and have a potential error sooner rather than later.

Member Author


I set it to clear the hub_token attribute as soon as the repo is created, and made sure it wasn't being printed in the __repr__, so I think this is fairly safe.
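The safety pattern described here can be sketched as follows. This is a hedged, self-contained illustration (class and method names are ours, and `_create_repo` is a stand-in for the real clone/create call), not the actual transformers code:

```python
class PushCallbackSketch:
    """Illustrates the token-hygiene pattern: consume the token once at
    construction, clear the attribute, and keep it out of __repr__."""

    def __init__(self, repo_name, hub_token):
        self.repo_name = repo_name
        # The token is used exactly once, to create/clone the repo...
        self.repo = self._create_repo(repo_name, hub_token)
        # ...and then cleared so it cannot leak via pickling, logging,
        # or users printing the callback object.
        self.hub_token = None

    def _create_repo(self, name, token):
        # Stand-in for the real repo creation call that consumes the token.
        return f"<repo {name}>"

    def __repr__(self):
        # Deliberately omit anything token-related from the repr.
        return f"PushCallbackSketch(repo_name={self.repo_name!r})"


cb = PushCallbackSketch("my-model", hub_token="hf_secret")
```

With this shape, `repr(cb)` never contains the secret, and `cb.hub_token` is `None` as soon as construction finishes — the same two guarantees discussed in the thread.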

Collaborator


But why not create the repo at init? Is there some info we only get later?

Member Author


The main reason is that it's surprising for Keras users - Callbacks usually don't do anything visible at init, because they exist to trigger actions at certain points during training. If you think the problems with moving initialization to on_train_begin are too annoying, though, I can create the repo there, I'm sure it's fine!

Collaborator


In this case I find it weird not to do it at init (the Trainer does it there, FYI): this method clones the repo we will work in, which is something you would normally do before executing your script. If there is an error because the repo already exists and doesn't match the clone_from, it will be more obvious if it's raised at callback creation rather than when launching the training.

Member Author


That makes sense, yes. I'll move it!

Member Author


Done!

@Rocketknight1
Member Author

@sgugger all comments should be addressed and testing looks good. My main remaining concern is that model.push_to_hub does not generate a model card by default, because that method comes from the cross-platform PushToHubMixin. If you want, I can edit that method so that it creates a model card if you pass it a Keras model history? This would leave the behaviour for Trainer unchanged.

@sgugger
Collaborator

sgugger commented Dec 13, 2021

I'm fine with both options, so it's really up to what you think is best @Rocketknight1
Also, if you could rebase on master to fix the CI that would be great :-)

@Rocketknight1
Member Author

Will do both!

Collaborator

@sgugger sgugger left a comment


Perfect, thanks a lot for all your work on this!

Member

@LysandreJik LysandreJik left a comment


This is great! Do you have an example of what a training run would look like, that we could link in the release and in the docs?

@@ -377,6 +383,7 @@ class TrainingSummary:
eval_results: Optional[Dict[str, float]] = None
eval_lines: Optional[List[str]] = None
hyperparameters: Optional[Dict[str, Any]] = None
source: Optional[str] = "trainer"
Member


Cool addition!

Comment on lines +667 to +672
if tags is None:
tags = ["generated_from_keras_callback"]
elif isinstance(tags, str) and tags != "generated_from_keras_callback":
tags = [tags, "generated_from_keras_callback"]
elif "generated_from_keras_callback" not in tags:
tags.append("generated_from_keras_callback")
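The intent of the branch logic quoted above — every path ends with the marker tag present — can be exercised as a standalone function. A sketch, with a function name of our own choosing (in transformers this logic lives inline, not in a named helper):

```python
def add_keras_callback_tag(tags):
    """Normalise a user-supplied `tags` value (None, str, or list) so that
    the generated_from_keras_callback marker is always included."""
    marker = "generated_from_keras_callback"
    if tags is None:
        # No tags supplied: the marker becomes the only tag.
        tags = [marker]
    elif isinstance(tags, str) and tags != marker:
        # A single string tag: wrap it together with the marker.
        tags = [tags, marker]
    elif marker not in tags:
        # A list of tags: append the marker if it's missing (a bare
        # marker string also falls through here unchanged).
        tags.append(marker)
    return tags
```

This is how Hub-side tracking works: whatever the user passes, cards generated via the Keras callback always carry the marker tag, so they can be filtered on the Hub.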
Member


Nice! This will be super useful to track!

@Rocketknight1
Member Author

@LysandreJik I'm hoping to post an example with it when it's ready, but right now I'm having some issues with the method generating malformed YAML and they're a pain to track down!

Collaborator

@sgugger sgugger left a comment


Nice new test!

@Rocketknight1
Member Author

@sgugger @LysandreJik This should be ready to go now - some tests are failing even after rebasing but they have nothing to do with this PR. Okay if I merge?

@sgugger
Collaborator

sgugger commented Dec 15, 2021

Yes you can!

@Rocketknight1 Rocketknight1 merged commit 48d4827 into master Dec 15, 2021
@Rocketknight1 Rocketknight1 deleted the tf_model_card branch December 15, 2021 14:57
3 participants