📝 update metric with evaluate (#18535)
stevhliu authored Aug 9, 2022
1 parent 9f5fe63 commit 0c183cc
Showing 1 changed file with 10 additions and 8 deletions.
18 changes: 10 additions & 8 deletions docs/source/en/training.mdx
@@ -98,18 +98,18 @@ Specify where to save the checkpoints from your training:
>>> training_args = TrainingArguments(output_dir="test_trainer")
```

### Metrics
### Evaluate

[`Trainer`] does not automatically evaluate model performance during training. You will need to pass [`Trainer`] a function to compute and report metrics. The 🤗 Datasets library provides a simple [`accuracy`](https://huggingface.co/metrics/accuracy) function you can load with the `load_metric` (see this [tutorial](https://huggingface.co/docs/datasets/metrics.html) for more information) function:
[`Trainer`] does not automatically evaluate model performance during training. You'll need to pass [`Trainer`] a function to compute and report metrics. The [🤗 Evaluate](https://huggingface.co/docs/evaluate/index) library provides a simple [`accuracy`](https://huggingface.co/spaces/evaluate-metric/accuracy) function you can load with the [`evaluate.load`] function (see this [quicktour](https://huggingface.co/docs/evaluate/a_quick_tour) for more information):

```py
>>> import numpy as np
>>> from datasets import load_metric
>>> import evaluate

>>> metric = load_metric("accuracy")
>>> metric = evaluate.load("accuracy")
```

Call `compute` on `metric` to calculate the accuracy of your predictions. Before passing your predictions to `compute`, you need to convert the predictions to logits (remember all 🤗 Transformers models return logits):
Call [`~evaluate.compute`] on `metric` to calculate the accuracy of your predictions. Before passing your predictions to `compute`, you need to convert the logits to predictions (remember all 🤗 Transformers models return logits):
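
As a rough, dependency-free illustration of the step described here, the sketch below takes the highest-scoring class for each example as the prediction and compares it with the labels. The helper names `argmax` and `compute_accuracy` and the sample logits are hypothetical and not part of 🤗 Evaluate; in the tutorial itself, the comparison is delegated to `metric.compute`.

```python
def argmax(row):
    """Return the index of the largest score in a row of logits."""
    return max(range(len(row)), key=row.__getitem__)


def compute_accuracy(eval_pred):
    """Hypothetical helper: turn (logits, labels) into an accuracy dict."""
    logits, labels = eval_pred
    # The predicted class is the position of the largest logit.
    predictions = [argmax(row) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return {"accuracy": correct / len(labels)}


# Made-up logits for four examples and two classes.
logits = [[2.0, -1.0], [0.1, 0.3], [-0.5, 1.5], [1.2, 0.8]]
labels = [0, 1, 0, 0]
result = compute_accuracy((logits, labels))
print(result)
```

The function returns a dictionary keyed by metric name, which is the same shape the tutorial's metric function hands back to [`Trainer`].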

```py
>>> def compute_metrics(eval_pred):
@@ -341,12 +341,14 @@ To keep track of your training progress, use the [tqdm](https://tqdm.github.io/)
... progress_bar.update(1)
```

### Metrics
### Evaluate

Just like how you need to add an evaluation function to [`Trainer`], you need to do the same when you write your own training loop. But instead of calculating and reporting the metric at the end of each epoch, this time you will accumulate all the batches with [`add_batch`](https://huggingface.co/docs/datasets/package_reference/main_classes.html?highlight=add_batch#datasets.Metric.add_batch) and calculate the metric at the very end.
Just like how you added an evaluation function to [`Trainer`], you need to do the same when you write your own training loop. But instead of calculating and reporting the metric at the end of each epoch, this time you'll accumulate all the batches with [`~evaluate.add_batch`] and calculate the metric at the very end.
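
To make the accumulate-then-compute pattern concrete, here is a minimal sketch that uses a hypothetical `RunningAccuracy` class as a stand-in for the loaded metric. It only mimics the `add_batch` and `compute` calling pattern; it is an assumption for illustration and says nothing about how 🤗 Evaluate stores batches internally.

```python
class RunningAccuracy:
    """Hypothetical stand-in for a metric that accumulates batches."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def add_batch(self, predictions, references):
        # Update running counts; nothing is reported yet.
        for pred, ref in zip(predictions, references):
            self.correct += int(pred == ref)
            self.total += 1

    def compute(self):
        # Produce the final score once every batch has been added.
        if self.total == 0:
            raise ValueError("add at least one batch before calling compute")
        return {"accuracy": self.correct / self.total}


# Made-up (predictions, references) pairs standing in for evaluation batches.
batches = [([0, 1, 1], [0, 1, 0]), ([1, 0], [1, 0]), ([1], [0])]

metric = RunningAccuracy()
for predictions, references in batches:
    metric.add_batch(predictions=predictions, references=references)

result = metric.compute()
print(result)
```

Because the score is computed over all accumulated examples, batches of different sizes are weighted by how many examples they contain rather than averaged per batch.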

```py
>>> metric = load_metric("accuracy")
>>> import evaluate

>>> metric = evaluate.load("accuracy")
>>> model.eval()
>>> for batch in eval_dataloader:
... batch = {k: v.to(device) for k, v in batch.items()}