Auto modelcard #11599
@@ -516,7 +516,12 @@ def compute_metrics(p: EvalPrediction):
                    writer.write(f"{index}\t{item}\n")

    if training_args.push_to_hub:
        trainer.push_to_hub()
        kwargs = {"finetuned_from": model_args.model_name_or_path}
Do you think we could directly define the `finetuned_from` model from the `Trainer`? Or is it not a good idea because some models could just have been pre-trained and not fine-tuned?
The `Trainer` gets a model; it has no idea which checkpoint was used.
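To illustrate the exchange above, here is a minimal sketch of why the calling script, rather than the `Trainer`, supplies the checkpoint name. `ScriptArgs` and `card_kwargs` are hypothetical stand-ins and not names from this PR; only the `finetuned_from` key and the `model_name_or_path` attribute appear in the diff. Omitting the key when no checkpoint is given is an assumption made here to cover models trained from scratch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScriptArgs:
    """Hypothetical stand-in for the example script's model arguments."""

    model_name_or_path: Optional[str] = None


def card_kwargs(args: ScriptArgs) -> dict:
    """Collect model card kwargs that only the calling script knows.

    The script loaded the model, so it knows the checkpoint identifier and
    forwards it explicitly. When no checkpoint was used (assumed here to be
    signalled by None), the key is left out entirely.
    """
    kwargs = {}
    if args.model_name_or_path is not None:
        kwargs["finetuned_from"] = args.model_name_or_path
    return kwargs
```

With this shape, a model pre-trained from scratch simply produces a card without a `finetuned_from` entry.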
I think the subsections in the `.md` file need to be shifted one to the left: `\n ##` -> `\n##`
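As a rough illustration of the fix being requested, the sketch below strips whitespace that precedes a heading marker at the start of a line. `dedent_headings` is a hypothetical helper, not code from this PR, and it would also touch indented lines that merely look like headings (for example a `# comment` inside an indented code block), so it is only a sketch.

```python
import re


def dedent_headings(markdown: str) -> str:
    """Remove spaces or tabs that precede a Markdown heading marker."""
    # ^ matches at every line start because of re.MULTILINE; the heading
    # marker itself (1 to 6 '#' followed by a space) is kept via group 1.
    return re.sub(r"^[ \t]+(#{1,6} )", r"\1", markdown, flags=re.MULTILINE)
```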
For reviewers: the old `ModelCard` class was already set to be deprecated in February 2020 and hasn't been used since.
The result looks super cool! Love the model card resulting from the training.
As said offline, would be really cool to have the metadata be populated as well, as this would allow programmatic handling of checkpoints and would open the door to a myriad of features, paving the way for model evaluation.
I understand this may require some changes in the `datasets` lib, pinging @lhoestq as discussed offline.
Otherwise, LGTM!
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Great, I think this is a fantastic addition. To make it easier on reviewers, here's an example of the modelcard once uploaded to the hub:
https://huggingface.co/sgugger/tst-glue-mrpc/blob/main/README.md
An example of the metadata generated is visible here:
---
tags:
- text-classification
datasets:
- glue
metrics:
- accuracy
- f1
model-index:
- name: tst-glue-mrpc
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: GLUE MRPC
      type: glue
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8529411764705882
    - name: F1
      type: f1
      value: 0.8969072164948454
---
This follows the format defined in huggingface/huggingface_hub#39.
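For readers who want to see how such a `model-index` entry could be assembled programmatically, here is a minimal sketch that builds the same nested structure as plain Python data. `build_model_index` and its argument layout are assumptions made for illustration; this is not the code added by the PR, and serializing the result to YAML is left out.

```python
from typing import List, Tuple


def build_model_index(
    model_name: str,
    task: Tuple[str, str],  # (display name, type)
    dataset: Tuple[str, str],  # (display name, type)
    metrics: List[Tuple[str, str, float]],  # (display name, type, value)
) -> List[dict]:
    """Return a model-index list with one model and one result entry."""
    return [
        {
            "name": model_name,
            "results": [
                {
                    "task": {"name": task[0], "type": task[1]},
                    "dataset": {"name": dataset[0], "type": dataset[1]},
                    "metrics": [
                        {"name": name, "type": kind, "value": value}
                        for name, kind, value in metrics
                    ],
                }
            ],
        }
    ]


# Usage with the values from the metadata shown above.
index = build_model_index(
    "tst-glue-mrpc",
    ("Text Classification", "text-classification"),
    ("GLUE MRPC", "glue"),
    [("Accuracy", "accuracy", 0.8529411764705882), ("F1", "f1", 0.8969072164948454)],
)
```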
Might be of interest to @lewtun, @lhoestq
Let's go ahead and implement the remaining tasks! 🎉
LGTM! Great job @sgugger!
    def __post_init__(self):
        # Infer default license from the checkpoint used, if possible.
        if self.license is None and not is_offline_mode() and self.finetuned_from is not None:
            try:
                model_info = HfApi().model_info(self.finetuned_from)
                for tag in model_info.tags:
                    if tag.startswith("license:"):
                        self.license = tag[8:]
            except requests.exceptions.HTTPError:
                pass
Very cool to inherit the license!
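The tag-parsing step of the snippet above can be tried in isolation, without any network call. The sketch below is a hypothetical standalone helper (`license_from_tags` is not a name from the PR) that takes a list of tags and extracts the license. One deliberate difference: it returns the first matching tag, whereas the loop in the PR keeps the last one.

```python
from typing import Iterable, Optional

LICENSE_PREFIX = "license:"


def license_from_tags(tags: Iterable[str]) -> Optional[str]:
    """Return the license named by the first 'license:' tag, or None."""
    for tag in tags:
        if tag.startswith(LICENSE_PREFIX):
            # Slice by the prefix length instead of a hard-coded 8.
            return tag[len(LICENSE_PREFIX):]
    return None
```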
* Autogenerate model cards from the Trainer
* ModelCard deprecated
* Fix test
* Style
* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Address review comments
* Quality
* With all metadata
* Metadata
* Post-merge conflict mess
* Data args and all examples
* Default license and languages when possible

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
What does this PR do?
This PR adds functionality in the `Trainer` to auto-generate model cards, plus some utilities to do the same without the `Trainer` for people who are not using it. In passing, the old `ModelCard` class is deprecated (to be removed in v5).

As an example, here is a repo generated by the `run_glue` script with this new functionality, using the following command on a machine with 2 GPUs:

I've only adjusted the glue example for now; I will do the others once we have settled on an API.