- In conformer-based experiments, `nn.BatchNorm1d` was not used in `ConvolutionModule`, which made the training more stable.
- To manually remove `nn.BatchNorm1d`, please modify this file: `espnet/nets/pytorch_backend/conformer/convolution.py`
  - Comment out the following line in `__init__`: `self.norm = nn.BatchNorm1d(channels)`
  - Modify the 1D depthwise convolution in `forward` as follows:

        # 1D Depthwise Conv
        x = self.depthwise_conv(x)
        # x = self.activation(self.norm(x))
        x = self.activation(x)
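
If you would rather script the change than edit the file by hand, the two edits can be sketched as a small text transformation. This is only an illustration: `remove_batchnorm` is a hypothetical helper (not part of ESPnet), and it assumes the two statements appear in the file exactly as quoted in the steps above.

```python
def remove_batchnorm(source: str) -> str:
    """Return `source` with the BatchNorm1d lines commented out.

    Assumption: the statements match the ones quoted in this README verbatim;
    any line that does not match exactly is left untouched.
    """
    norm_def = "self.norm = nn.BatchNorm1d(channels)"
    norm_use = "x = self.activation(self.norm(x))"

    patched = []
    for line in source.splitlines():
        stripped = line.strip()
        indent = line[: len(line) - len(line.lstrip())]
        if stripped == norm_def:
            # __init__: comment out the layer definition.
            patched.append(f"{indent}# {stripped}")
        elif stripped == norm_use:
            # forward: keep the activation, skip the normalization.
            patched.append(f"{indent}# {stripped}")
            patched.append(f"{indent}x = self.activation(x)")
        else:
            patched.append(line)

    trailing_newline = "\n" if source.endswith("\n") else ""
    return "\n".join(patched) + trailing_newline
```

To apply it, read `espnet/nets/pytorch_backend/conformer/convolution.py` into a string, pass it through `remove_batchnorm`, and write the result back. Review the diff before training, since the helper silently does nothing if the file's contents differ from the lines assumed here.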
Google Speech Commands Paper: https://arxiv.org/abs/1804.03209
Two versions are supported in this recipe: 12 commands and 35 commands. The variable `num_commands` in `run.sh` should be set to 12 or 35.
- 12 commands: 10 words + silence + unknown. Results on two test sets are reported: (1) (test) a standard test set from the original paper, and (2) (test_speechbrain) a test set used in SpeechBrain's recipe.
- 35 commands: all 35 command words. The entire test set from the original paper is used.
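
As a rough illustration of how the 12-command label set relates to the 35 words, the sketch below collapses a word into one of 12 classes. It assumes the ten target words are those of the standard 12-class task from the Speech Commands paper; the label strings `_silence_` and `_unknown_` and the function name are assumptions made for this example and may differ from what the recipe's data preparation actually writes.

```python
# Ten target words of the standard 12-class task (assumed to match this recipe).
TARGET_WORDS = ("yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go")

# Assumed label names for the two extra classes.
SILENCE = "_silence_"
UNKNOWN = "_unknown_"

# 10 words + silence + unknown = 12 classes.
LABELS_12 = list(TARGET_WORDS) + [SILENCE, UNKNOWN]


def to_12_class_label(word: str) -> str:
    """Map a spoken word (or the silence marker) to one of the 12 classes."""
    if word in TARGET_WORDS:
        return word
    if word == SILENCE:
        return SILENCE
    # Every other command word is folded into the "unknown" class.
    return UNKNOWN
```

With this mapping, a word such as `marvin`, which is one of the 35 commands but not one of the ten targets, ends up in the unknown class, whereas in the 35-command setup it keeps its own label.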
Model: https://zenodo.org/record/5635530#.YcaCZBOZMVU
- date: `Sun Oct 3 05:20:21 UTC 2021`
- python version: `3.8.12 | packaged by conda-forge | (default, Sep 16 2021, 02:08:29) [GCC 9.4.0]`
- espnet version: `espnet 0.10.3a3`
- pytorch version: `pytorch 1.9.0`
- Git hash: `8536be6afc363bcf6b4fc6f41d612e42173de46c`
  - Commit date: `Sun Oct 3 04:15:48 2021 +0000`

dataset | total | correct | accuracy |
---|---|---|---|
dev | 4605 | 4499 | 0.9770 |
test | 4890 | 4785 | 0.9785 |
test_speechbrain | 4886 | 4809 | 0.9842 |

dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
---|---|---|---|---|---|---|---|---|
infer/dev | 4605 | 4605 | 97.7 | 2.3 | 0.0 | 0.0 | 2.3 | 2.3 |
infer/test | 4890 | 4890 | 97.9 | 2.1 | 0.0 | 0.0 | 2.1 | 2.1 |
infer/test_speechbrain | 4886 | 4886 | 98.4 | 1.6 | 0.0 | 0.0 | 1.6 | 1.6 |
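
The two tables report the same result in two forms. In the second table `Snt` equals `Wrd` and there are no deletions or insertions, so each utterance is scored as a single word and the error rate is simply one minus the accuracy. The sketch below recomputes both from the `total` and `correct` counts listed above; the variable and function names are chosen for this example only.

```python
# (total, correct) counts copied from the 12-command accuracy table.
RESULTS_12 = {
    "dev": (4605, 4499),
    "test": (4890, 4785),
    "test_speechbrain": (4886, 4809),
}


def accuracy(total: int, correct: int) -> float:
    """Fraction of utterances whose command was recognized correctly."""
    return correct / total


def error_rate_percent(total: int, correct: int) -> float:
    """Error rate in percent, assuming one word per utterance."""
    return 100.0 * (total - correct) / total


for name, (total, correct) in RESULTS_12.items():
    acc = accuracy(total, correct)
    err = error_rate_percent(total, correct)
    print(f"{name}: accuracy={acc:.4f}, error={err:.1f}%")
```

The same check applies to the 35-command results by substituting their counts.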
Model: https://zenodo.org/record/5637586#.YcaCQhOZMVU
- date: `Mon Oct 4 20:07:28 UTC 2021`
- python version: `3.8.12 | packaged by conda-forge | (default, Sep 16 2021, 02:08:29) [GCC 9.4.0]`
- espnet version: `espnet 0.10.3a3`
- pytorch version: `pytorch 1.9.0`
- Git hash: `94a64d4037602b2a7944619075bbc04ebdcd963d`
  - Commit date: `Sun Oct 3 04:24:10 2021 +0000`

dataset | total | correct | accuracy |
---|---|---|---|
dev | 9981 | 9725 | 0.9744 |
test | 11005 | 10732 | 0.9752 |

dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
---|---|---|---|---|---|---|---|---|
infer/dev | 9981 | 9981 | 97.4 | 2.6 | 0.0 | 0.0 | 2.6 | 2.6 |
infer/test | 11005 | 11005 | 97.5 | 2.5 | 0.0 | 0.0 | 2.5 | 2.5 |