TIMIT typically reports PER, not WER #94

pzelasko · 2022-05-31T15:32:34Z

The docs here mention that TIMIT reports WER, but this dataset typically serves as a benchmark for phone error rate (PER), because it’s one of the few resources that have manually annotated phone segments. I recommend fixing and clarifying that in the README:

evaluate/metrics/wer/README.md

Line 68 in c1141b0

    For example, datasets such as [LibriSpeech](https://huggingface.co/datasets/librispeech_asr) report a WER in the 1.8-3.3 range, whereas ASR models evaluated on [Timit](https://huggingface.co/datasets/timit_asr) report a WER in the 8.3-20.4 range.

I think it would be good to draw a clear distinction between word/character/phone/token error rate (WER/CER/PER/TER) at the library level.
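To illustrate the distinction, here is a rough sketch of how the four rates differ only in the unit the edit distance is computed over. This is not the library's API: `edit_distance`, `error_rate`, and the tokenizer helpers are hypothetical names, and the phone tokenizer assumes transcripts are already space-separated phone labels.

```python
from typing import Callable, List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) between token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(references: List[str],
               hypotheses: List[str],
               tokenize: Callable[[str], List[str]]) -> float:
    """Total edit errors divided by total reference tokens, for a given token unit."""
    errors = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = tokenize(ref)
        errors += edit_distance(ref_tokens, tokenize(hyp))
        total += len(ref_tokens)
    if total == 0:
        raise ValueError("references contain no tokens")
    return errors / total


def words(text: str) -> List[str]:
    # WER unit: whitespace-separated words.
    return text.split()


def chars(text: str) -> List[str]:
    # CER unit: every character, spaces included (an assumption; some tools strip them).
    return list(text)


def phones(text: str) -> List[str]:
    # PER unit: assumes the transcript is already a space-separated phone string.
    return text.split()


wer = error_rate(["the cat sat"], ["the cat sit"], words)
cer = error_rate(["the cat sat"], ["the cat sit"], chars)
per = error_rate(["dh ax k ae t"], ["dh ah k ae t"], phones)
```

The same one-substitution error gives different numbers depending on the unit (1 of 3 words, 1 of 11 characters, 1 of 5 phones), which is why a PER reported on TIMIT is not comparable to a WER reported on LibriSpeech.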
