RESULTS

Notes

In the Conformer-based experiments, nn.BatchNorm1d was removed from ConvolutionModule, which made training more stable.

To manually remove nn.BatchNorm1d, please modify this file:

espnet/nets/pytorch_backend/conformer/convolution.py

Comment out the following line in __init__:
```
self.norm = nn.BatchNorm1d(channels)
```

Modify the 1D depthwise convolution step in forward as follows:

# 1D Depthwise Conv
x = self.depthwise_conv(x)
# x = self.activation(self.norm(x))
x = self.activation(x)
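
The two manual edits above can also be applied programmatically. The following is only a sketch: `remove_batchnorm` is a hypothetical helper (not part of ESPnet), and it assumes the source file contains exactly the two statements quoted above, which may not hold for other ESPnet versions.

```python
def remove_batchnorm(source: str) -> str:
    """Return `source` with the two BatchNorm edits described above applied.

    Hypothetical helper: it only matches the exact statements quoted in
    this README and leaves every other line unchanged.
    """
    patched = []
    for line in source.splitlines(keepends=True):
        stripped = line.strip()
        indent = line[: len(line) - len(line.lstrip())]
        if stripped == "self.norm = nn.BatchNorm1d(channels)":
            # Comment out the layer construction in __init__.
            patched.append(f"{indent}# {stripped}\n")
        elif stripped == "x = self.activation(self.norm(x))":
            # Keep the old call as a comment and apply the activation directly.
            patched.append(f"{indent}# {stripped}\n")
            patched.append(f"{indent}x = self.activation(x)\n")
        else:
            patched.append(line)
    return "".join(patched)


if __name__ == "__main__":
    example = (
        "        self.norm = nn.BatchNorm1d(channels)\n"
        "        x = self.depthwise_conv(x)\n"
        "        x = self.activation(self.norm(x))\n"
    )
    print(remove_batchnorm(example))
```

Lines that are already commented out do not match, so running the helper twice leaves the text unchanged after the first pass.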

Dataset

Google Speech Commands Paper: https://arxiv.org/abs/1804.03209

Two versions are supported in this recipe: 12 commands and 35 commands. The variable num_commands in run.sh should be set to 12 or 35.

- 12 commands: 10 words + silence + unknown. Results are reported on two test sets: (1) test, the standard test set from the original paper, and (2) test_speechbrain, the test set used in SpeechBrain's recipe.
- 35 commands: all 35 command words. The full test set from the original paper is used.
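
As an illustration of how the two label sets relate, the sketch below maps a spoken word to its class under each setting. The ten core words are those of the standard 12-class task from the dataset paper; the function name `map_label` and the label strings `"unknown"` and `"silence"` are assumptions for this example and may differ from the tokens the recipe actually writes.

```python
# The ten core command words of the 12-class task.
CORE_WORDS = ("yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go")

# Assumed label strings for the two extra classes (hypothetical names).
UNKNOWN = "unknown"
SILENCE = "silence"  # silence clips come from background noise, not from a word


def map_label(word: str, num_commands: int) -> str:
    """Map a spoken word to its class label for the given setting."""
    if num_commands == 35:
        # Every command word is its own class.
        return word
    if num_commands == 12:
        # Words outside the ten core commands collapse into one class.
        return word if word in CORE_WORDS else UNKNOWN
    raise ValueError("num_commands must be 12 or 35")


if __name__ == "__main__":
    for w in ("yes", "marvin"):
        print(w, map_label(w, 12), map_label(w, 35))
```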

asr_conformer_noBatchNorm_warmup5k_lr2e-4_accum3_conv15_5speeds (12 commands)

Model: https://zenodo.org/record/5635530#.YcaCZBOZMVU

Environments

date: Sun Oct 3 05:20:21 UTC 2021
python version: 3.8.12 | packaged by conda-forge | (default, Sep 16 2021, 02:08:29) [GCC 9.4.0]
espnet version: espnet 0.10.3a3
pytorch version: pytorch 1.9.0
Git hash: 8536be6afc363bcf6b4fc6f41d612e42173de46c
- Commit date: Sun Oct 3 04:15:48 2021 +0000

Classification Accuracy

| dataset | total | correct | accuracy |
|---|---|---|---|
| dev | 4605 | 4499 | 0.9770 |
| test | 4890 | 4785 | 0.9785 |
| test_speechbrain | 4886 | 4809 | 0.9842 |

WER

| dataset | Snt | Wrd | Corr | Sub | Err | S.Err |
|---|---|---|---|---|---|---|
| infer/dev | 4605 | 4605 | 97.7 | 2.3 | 2.3 | 2.3 |
| infer/test | 4890 | 4890 | 97.9 | 2.1 | 2.1 | 2.1 |
| infer/test_speechbrain | 4886 | 4886 | 98.4 | 1.6 | 1.6 | 1.6 |
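
Because every utterance contains a single word (Snt equals Wrd in the table above), the Err column should simply be the complement of the classification accuracy, expressed in percent. The sketch below recomputes both figures from the total and correct counts reported for the 12-command model; `accuracy_and_error` is a hypothetical helper written for this check, not part of the recipe.

```python
def accuracy_and_error(total: int, correct: int) -> tuple:
    """Return (accuracy rounded to 4 places, error rate in % rounded to 1 place)."""
    accuracy = correct / total
    error_pct = 100.0 * (total - correct) / total
    return round(accuracy, 4), round(error_pct, 1)


# (total, correct) counts copied from the Classification Accuracy table.
RESULTS_12_COMMANDS = {
    "dev": (4605, 4499),
    "test": (4890, 4785),
    "test_speechbrain": (4886, 4809),
}

if __name__ == "__main__":
    for name, (total, correct) in RESULTS_12_COMMANDS.items():
        acc, err = accuracy_and_error(total, correct)
        print(f"{name}: accuracy={acc:.4f} err={err:.1f}%")
```

The recomputed values agree with both tables, for example 0.9770 accuracy and 2.3% error on dev.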

asr_35commands_conformer_noBatchNorm_warmup5k_lr2e-4_accum3_conv15_5speeds (35 commands)

Model: https://zenodo.org/record/5637586#.YcaCQhOZMVU

Environments

date: Mon Oct 4 20:07:28 UTC 2021
python version: 3.8.12 | packaged by conda-forge | (default, Sep 16 2021, 02:08:29) [GCC 9.4.0]
espnet version: espnet 0.10.3a3
pytorch version: pytorch 1.9.0
Git hash: 94a64d4037602b2a7944619075bbc04ebdcd963d
- Commit date: Sun Oct 3 04:24:10 2021 +0000

Classification Accuracy

| dataset | total | correct | accuracy |
|---|---|---|---|
| dev | 9981 | 9725 | 0.9744 |
| test | 11005 | 10732 | 0.9752 |

WER

| dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
|---|---|---|---|---|---|---|---|---|
| infer/dev | 9981 | 9981 | 97.4 | 2.6 | 0.0 | 0.0 | 2.6 | 2.6 |
| infer/test | 11005 | 11005 | 97.5 | 2.5 | 0.0 | 0.0 | 2.5 | 2.5 |
