
Feature: integration standard libraries #11

Open · lvwerra opened this issue Apr 6, 2022 · 6 comments

Labels: enhancement (New feature or request)

lvwerra (Member) commented Apr 6, 2022

Create a uniform integration for metrics libraries such as scikit-learn, keras.metrics, or torchmetrics, instead of integrating the metrics from these frameworks one by one. The goal of the integration is to be able to load the metrics from these frameworks with a syntax as follows:

accuracy = load_metric("scikit-learn/accuracy")

This way there is also more transparency about where the metrics come from, especially if they are just wrappers around existing libraries. It would also simplify maintenance, since we would not need to change every metric separately when something needs updating.
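
To make the proposed mechanism concrete, here is a minimal sketch of how such a loader could dispatch on the library prefix (all names and the mapping below are hypothetical, not the actual evaluate API):

# Hypothetical sketch of prefix-based dispatch; not the actual evaluate implementation.
import sklearn.metrics

def load_metric(path: str):
    """Load a metric such as "scikit-learn/accuracy" by dispatching on the prefix."""
    library, name = path.split("/", 1)
    if library == "scikit-learn":
        # Map the metric name to the corresponding scikit-learn function.
        fn = {"accuracy": sklearn.metrics.accuracy_score,
              "f1": sklearn.metrics.f1_score}[name]
        return lambda references, predictions, **kwargs: fn(references, predictions, **kwargs)
    raise ValueError(f"Unknown metric library: {library}")

accuracy = load_metric("scikit-learn/accuracy")
print(accuracy(references=[0, 1, 1], predictions=[0, 1, 0]))  # 0.666...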

Popular metrics libraries:

Others (maybe a bit more niche):

If all metrics (incl. the "canonical" ones) are going to live on the Hub, such an integration could just be a script that automatically creates (or updates) the necessary repositories on the Hub.
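
A minimal sketch of such a script, assuming huggingface_hub's HfApi (the repo namespace, metric list, and file layout below are assumptions for illustration):

# Illustrative sketch: create or update one Hub repository per metric.
from huggingface_hub import HfApi

METRICS = ["accuracy", "f1", "recall"]  # hypothetical list of metrics to publish

api = HfApi()
for name in METRICS:
    repo_id = f"evaluate-metric/{name}"  # hypothetical namespace
    api.create_repo(repo_id=repo_id, exist_ok=True)
    api.upload_file(
        path_or_fileobj=f"metrics/{name}/{name}.py",  # local metric script
        path_in_repo=f"{name}.py",
        repo_id=repo_id,
    )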

@lvwerra added the enhancement (New feature or request) label and removed the documentation label on Apr 6, 2022
sashavor (Contributor) commented:

We could also connect to GEM, as per @yjernite's proposal in another thread.

sashavor (Contributor) commented Apr 19, 2022

And to Skimage

sashavor (Contributor) commented:

And to Torch Fidelity, for generative metrics.

thomwolf (Member) commented Apr 26, 2022

I would rather reimplement these metrics than depend on other libs.

My heuristic:

  • If the metric is something stable that we don't expect to change (e.g. the Jaccard score), I would reimplement it. Note that if the license of the other lib is permissive (Apache 2 or MIT), you can copy the code with a copyright citation and it's fine.
  • If the metric is expected to change, e.g. BERTScore or other metrics that are either very recent or based on a pretrained neural net, I would rather depend on an external lib to avoid having to keep our metric in sync.

lvwerra (Member, Author) commented Apr 26, 2022

So this would be quite a deviation from the current state of the metrics: most of them are wrappers around existing libraries. Just to name a few (see the sketch below the list):

  • "accuracy", "recall", "f1" are all wrappers around scikit-learn
  • "meteor" uses nltk
  • "seqeval" is a wrapper for seqeval
  • "spearmanr" comes from scipy

Would you reimplement all of them? As an example, there are O(100) metrics in scikit-learn and we only have a handful of them inside evaluate at the moment. Also, scikit-learn has a BSD license, which would mean we would need a BSD license as well if we copy-paste the code, right?

My main concern is that we can't move very fast and cover a wide range of modalities if we implement most of the metrics ourselves. I feel like one criterion for adoption is how quickly we cover a wide range of metrics, so that users don't have to leave the library because one is missing. On the other hand, with community metrics, users would probably add missing metrics quickly by creating wrappers, since that's the fastest way (which is probably more or less what happened in datasets/metrics).

Regarding dependencies: the idea was that each metric's repository has a requirements.txt file, so we don't need to add the dependencies to evaluate but can still easily guide the user to update their installation.
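
For example, a metric script could guard its import and point the user at the metric's own requirements, along these lines (hypothetical wording, just to illustrate the idea):

# Hypothetical sketch: fail with an actionable message instead of adding a hard dependency to evaluate.
try:
    import seqeval.metrics
except ImportError as err:
    raise ImportError(
        'To use the "seqeval" metric, install its requirements: pip install seqeval'
    ) from err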

What do you think about the following:

  • Phase 1: allow 3rd-party integrations and slowly migrate the wrappers out of evaluate/metrics
  • Phase 2: add proper implementations to evaluate/metrics where there is a need

johnnv1 commented Jun 16, 2022

Just giving my half-cent opinion here… if you are going to rewrite the metrics, it could be done in a way where the metrics are computed from a confusion matrix. I know this works for the metrics used in computer vision; I don't know whether the same would be true for NLP.

But as already noted, this would probably take more time than third-party integrations 😅
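
A small sketch of that idea in plain NumPy, just to illustrate (not a proposal for the actual implementation):

# Illustrative only: derive accuracy and per-class recall from a confusion matrix.
import numpy as np

def confusion_matrix(references, predictions, num_classes):
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(references, predictions):
        cm[t, p] += 1
    return cm

cm = confusion_matrix([0, 1, 1, 2], [0, 1, 0, 2], num_classes=3)
accuracy = np.trace(cm) / cm.sum()                    # correct / total
recall = np.diag(cm) / np.maximum(cm.sum(axis=1), 1)  # per-class recall, guarding empty classes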
