
Metrics for multilabel problems don't match the expected format. #585

Closed
adamamer20 opened this issue Apr 26, 2024 · 2 comments

Comments

@adamamer20

Issue

Evaluation metrics cannot be used for multilabel classification problems.

Reproducible example

You can find a reproducible snippet here
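
In short, a minimal sketch of the kind of call that fails (the 0/1 multilabel data below is made up):

```python
import evaluate

# Hypothetical multilabel data: one row per instance, one column per label.
predictions = [[0, 1, 1], [1, 1, 0]]
references = [[0, 1, 1], [0, 1, 0]]

f1_metric = evaluate.load("f1")  # default config expects one scalar label per instance

# Raises: ValueError: Predictions and/or references don't match the expected format.
f1_metric.compute(predictions=predictions, references=references, average="micro")
```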

Problem explanation

The error comes from the expected input format chosen for some metrics.
For example, for accuracy and for f1 (with average="micro" or average="macro"), the expected format is a scalar (Value(dtype='int32', id=None)), so the metric breaks down in the multilabel case (ValueError: Predictions and/or references don't match the expected format.).
Apart from the hassle of reshaping predictions and labels, and the confusion about which indices correspond to the same label and which to the same instance, this differs from how other libraries handle it: scikit-learn accepts nested lists for multilabel f1.

Possible solution

Refactor the expected format of the EvaluationModule for accuracy and f1 (and others) to also accept Sequence features.
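
As a rough sketch (not the actual implementation), the widened expected format could be declared with datasets' Sequence feature along these lines:

```python
from datasets import Features, Sequence, Value

# Sketch only: an expected-format declaration that accepts one label vector
# per instance instead of a single scalar label.
multilabel_features = Features(
    {
        "predictions": Sequence(Value("int32")),
        "references": Sequence(Value("int32")),
    }
)
```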

@shenxiangzhuang

Hi @adamamer20, did you try using f1_metric = evaluate.load("f1", "multilabel")?

Your question is similar to #550.
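
A quick sketch of what that looks like (the example data below is made up):

```python
import evaluate

f1_metric = evaluate.load("f1", "multilabel")

results = f1_metric.compute(
    predictions=[[0, 1, 1], [1, 1, 0]],
    references=[[0, 1, 1], [0, 1, 0]],
    average="micro",  # sklearn-style averaging over all instance/label pairs
)
print(results)  # -> {'f1': 0.857...}
```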

@adamamer20
Author

Thank you, it worked. I tried searching the docs, but there isn't anything on multilabel.
