Matthews Correlation Coefficient (MCC) always zero #492

Closed
mirix opened this issue Sep 6, 2023 · 1 comment

mirix commented Sep 6, 2023

Hello,

I am fine-tuning a multiclass audio classifier on two GPUs using the template script:

https://github.com/huggingface/transformers/blob/main/examples/pytorch/audio-classification/README.md

I have seven imbalanced classes.

The model learns: within a few epochs the loss drops from ca. 2 to ca. 1 and the weighted F1 goes from 0.26 to ca. 0.56.

Accuracy is initially higher than F1 (it starts at 0.46) and also improves significantly during training.

However, the Matthews correlation coefficient (MCC) stays at 0.0 throughout training.

I have changed the command line to:

--metric_for_best_model matthews_correlation \
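
For reference, the Python-side equivalent of that flag, assuming the stock transformers TrainingArguments (every value below except the metric name is a placeholder):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="audio-cls-output",                 # placeholder output directory
    evaluation_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="matthews_correlation",  # matched (with an "eval_" prefix) against the keys returned by compute_metrics
    greater_is_better=True,
)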

And the script to:

import evaluate
import numpy as np

metric = evaluate.load("matthews_correlation")

...

    def compute_metrics(eval_pred):
        """Computes accuracy on a batch of predictions"""
        predictions = np.argmax(eval_pred.predictions, axis=1)
        return metric.compute(predictions=predictions, references=eval_pred.label_ids, average='macro')

The average argument makes no difference; I have tried both macro and none, as well as removing it altogether.
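
A quick way to sanity-check the metric in isolation (the labels below are made up purely to exercise the call):

import evaluate
import numpy as np

mcc = evaluate.load("matthews_correlation")
preds = np.array([0, 1, 2, 2, 1, 0])  # dummy predictions
refs = np.array([0, 1, 1, 2, 1, 0])   # dummy references
print(mcc.compute(predictions=preds, references=refs))
# expected output: a dict like {'matthews_correlation': ...} containing a non-zero float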

Am I doing something wrong, is this behavior to be expected, or is it a bug?

Best,

Ed


mirix commented Sep 7, 2023

It was not zero; it was a display/rounding issue. When there is only one metric, the trainer seems to report it with a single decimal place (0.0). However, when I load several metrics and choose MCC as the one for model selection, the trainer reports several decimal places for each metric.
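
A minimal sketch of that multi-metric setup (the extra metrics mirror the accuracy and weighted F1 mentioned above; the exact choices are an assumption, not the original script):

import evaluate
import numpy as np

accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")
mcc = evaluate.load("matthews_correlation")

def compute_metrics(eval_pred):
    """Computes accuracy, weighted F1 and MCC on a batch of predictions."""
    predictions = np.argmax(eval_pred.predictions, axis=1)
    references = eval_pred.label_ids
    results = {}
    results.update(accuracy.compute(predictions=predictions, references=references))
    results.update(f1.compute(predictions=predictions, references=references, average="weighted"))
    results.update(mcc.compute(predictions=predictions, references=references))
    return results

With --metric_for_best_model matthews_correlation, the trainer then selects checkpoints on eval_matthews_correlation while logging all three metrics.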

mirix closed this as completed on Sep 7, 2023