TF model cards #14720
Conversation
Thanks for adding this! Let's avoid storing a token that users may then share by mistake if we can :-)
src/transformers/keras_callbacks.py (outdated)
# TODO Is it okay to store the hub token as a callback attribute like this?
self.hub_token = hub_token
If you store it, make sure it doesn't appear in the repr (we had that surprise with TrainingArguments) or in anything saved automatically.
Is there any way to make this work without storing the hub_token? I could move the repo creation to __init__, which would avoid the need to store the token as a property, but that might surprise users: the repo would be created when the callback is constructed rather than when training begins.
I think it's better to do it at init and have a potential error sooner rather than later.
I set it to clear the hub_token attribute as soon as the repo is created, and made sure it wasn't being printed in the __repr__, so I think this is fairly safe.
But why not create the repo at init? Is there some info we only get later?
The main reason is that it's surprising for Keras users - callbacks usually don't do anything visible at init, because they exist to trigger actions at certain points during training. If you think the downsides of doing it in on_train_begin are too annoying, though, I can create the repo at init - I'm sure it's fine!
In this case I find it weird not to do it at init (the Trainer does it there, FYI): this method clones the repo we will work in, which is something you would normally do before executing your script. If there is an error because the repo already exists and doesn't match clone_from, it will be more obvious if it is raised at callback creation rather than when launching the training.
That makes sense, yes. I'll move it!
Done!
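A rough sketch of what the eager-initialization version might look like, assuming the huggingface_hub Repository helper in use at the time; the class and argument names are illustrative, not the actual PushToHubCallback code:

import tensorflow as tf
from huggingface_hub import Repository

class EagerPushCallbackSketch(tf.keras.callbacks.Callback):
    def __init__(self, output_dir, hub_model_id, hub_token=None):
        super().__init__()
        self.output_dir = output_dir
        # Clone the target repo eagerly, so a mismatch with clone_from fails at
        # callback creation instead of surfacing later when training starts.
        self.repo = Repository(local_dir=output_dir, clone_from=hub_model_id, use_auth_token=hub_token)
        # The token is only needed for the clone, so it is not stored as an attribute.

    def on_train_end(self, logs=None):
        # Push whatever was saved into output_dir at the end of training.
        self.repo.push_to_hub()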
@sgugger all comments should be addressed and testing looks good. My main remaining concern is that
I'm fine with both options, so it's really up to what you think is best @Rocketknight1
Will do both!
Perfect, thanks a lot for all your work on this!
This is great! Do you have an example of what a training run would look like, that we could link in the release and in the docs?
@@ -377,6 +383,7 @@ class TrainingSummary:
    eval_results: Optional[Dict[str, float]] = None
    eval_lines: Optional[List[str]] = None
    hyperparameters: Optional[Dict[str, Any]] = None
    source: Optional[str] = "trainer"
Cool addition!
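A small illustration of the new field (the "keras" value for the callback path is an assumption here, not something confirmed by the diff above):

from transformers.modelcard import TrainingSummary

summary = TrainingSummary(model_name="my-finetuned-bert")
print(summary.source)  # "trainer" - the default, matching cards produced by the Trainer

# A summary built from the Keras callback would presumably override it, e.g.:
keras_summary = TrainingSummary(model_name="my-finetuned-bert", source="keras")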
if tags is None:
    tags = ["generated_from_keras_callback"]
elif isinstance(tags, str) and tags != "generated_from_keras_callback":
    tags = [tags, "generated_from_keras_callback"]
elif "generated_from_keras_callback" not in tags:
    tags.append("generated_from_keras_callback")
Nice! This will be super useful to track!
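For clarity, the same normalization written as a hypothetical standalone helper with example inputs:

def normalize_tags(tags):
    # Mirrors the branch above: ensure the callback tag ends up in the tag list.
    if tags is None:
        tags = ["generated_from_keras_callback"]
    elif isinstance(tags, str) and tags != "generated_from_keras_callback":
        tags = [tags, "generated_from_keras_callback"]
    elif "generated_from_keras_callback" not in tags:
        tags.append("generated_from_keras_callback")
    return tags

print(normalize_tags(None))                   # ['generated_from_keras_callback']
print(normalize_tags("text-classification"))  # ['text-classification', 'generated_from_keras_callback']
print(normalize_tags(["my-dataset-tag"]))     # ['my-dataset-tag', 'generated_from_keras_callback']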
@LysandreJik I'm hoping to post an example with it when it's ready, but right now I'm having some issues with the method generating malformed YAML and they're a pain to track down!
Nice new test!
Good job, Matt.
@sgugger @LysandreJik This should be ready to go now - some tests are failing even after rebasing but they have nothing to do with this PR. Okay if I merge?
Yes you can!
To summarize where things ended up:
- Model cards are created by the PushToHubCallback - the callback will peek at the metrics and accumulate information during the training run to enable this.
- model.push_to_hub() will not, by default, create a model card. This method is cross-platform (it comes from the PushToHubMixin), and changing its behaviour would negatively affect Trainer users.
- There is a create_model_card() method if users want to create a model card outside of the PushToHubCallback. Because this method can't peek at the training history like the callback can, you need to pass the History object returned by model.fit() to create the card (see the sketch below).
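A minimal end-to-end sketch of how these pieces might fit together; the checkpoint, repo names, and tiny dataset are placeholders, and the create_model_card argument names are assumptions based on the description above rather than a confirmed signature:

import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
from transformers.keras_callbacks import PushToHubCallback

# Assumes you are already logged in to the Hub (e.g. via huggingface-cli login).
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-cased")
model.compile(optimizer="adam")  # TF transformers models can compute their own loss

# Tiny placeholder dataset so the snippet is self-contained.
texts = ["I love this movie!", "Not my favourite."]
labels = [1, 0]
encodings = tokenizer(texts, padding=True, return_tensors="np")
train_dataset = tf.data.Dataset.from_tensor_slices((dict(encodings), labels)).batch(2)

# The callback accumulates metrics during training and writes the model card when it pushes.
callback = PushToHubCallback(output_dir="./model_output", tokenizer=tokenizer)
history = model.fit(train_dataset, epochs=1, callbacks=[callback])

# push_to_hub() alone does not create a model card; to build one outside the callback,
# pass the History object returned by model.fit() (argument names are illustrative).
model.push_to_hub("my-finetuned-bert")
model.create_model_card(output_dir="./model_output", model_name="my-finetuned-bert", keras_history=history)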