
updating librispeech recipe 2 #1727

Merged: 10 commits merged into kaldi-asr:master on Jul 9, 2017

Conversation

@LvHang commented Jun 29, 2017

This PR carries on from #1708.
It moves the original "tdnn" recipes into a tuning directory and adds new recipes written with xconfigs.
Thanks to @jonlnichols for the xconfig TDNN recipe (local/chain/tuning/run_tdnn_1b.sh).
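For context, the usual Kaldi convention for a tuning directory is to keep each experiment variant under local/chain/tuning/ and point a top-level symlink at the current best one. A minimal sketch of that layout (paths assumed for illustration, not copied from this PR):

```bash
# Hypothetical layout after moving recipes into the tuning directory:
# the top-level run_tdnn.sh becomes a symlink to the best tuning variant.
cd egs/librispeech/s5/local/chain
ln -sf tuning/run_tdnn_1b.sh run_tdnn.sh
```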

@danpovey commented

Thanks. But if the 1c turns out to be better than 1b, I'd rather not include the 1b, to avoid adding bulk to the repository.
Also, see if you can create a script, e.g. local/chain/compare_wer.sh, to automatically create that table of WERs, so that in future we don't have to do it manually. There is one in the WSJ setup that you might be able to adapt for this purpose.
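Such a script might look roughly like the following (a hypothetical sketch; the WSJ script it would be adapted from differs in detail, and the decode-directory naming is assumed from the tables in this thread):

```bash
#!/usr/bin/env bash
# Hypothetical sketch of local/chain/compare_wer.sh: print one WER row
# per test-set/LM combination, one column per experiment directory.
# Usage: local/chain/compare_wer.sh exp/chain/tdnn_1a_sp exp/chain/tdnn_1b_sp
echo -n "System"
for dir in "$@"; do echo -n "  $(basename $dir)"; done
echo
for test in dev_clean dev_other test_clean test_other; do
  for lm in fglarge tglarge tgmed tgsmall; do
    echo -n "${test}(${lm})"
    for dir in "$@"; do
      # take the best WER over the decoding parameter sweep
      wer=$(grep -h WER $dir/decode_${test}_${lm}/wer_* 2>/dev/null \
              | utils/best_wer.sh | awk '{print $2}')
      echo -n "  ${wer:-NA}"
    done
    echo
  done
done
```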

@LvHang commented Jun 29, 2017

Sure, I will create the script, compare_wer.sh.
When the experiments finish, I will let you know and deal with "1b" and "1c".

@LvHang commented Jul 3, 2017

Hi Dan,
Here are the newest results.
"1a" is the original recipe (the result comes from the RESULTS file).
"1b" is jonlnichols's recipe with xconfigs.
"1c" is very similar to "1b", but one line shorter: it drops [relu-batchnorm-layer name=tdnn2 dim=512 input=Append(-1,0,1)].
For "1d", I think its topology is the same as "1a", so it is the xconfig version of "1a".
From the results table, "1d" is the worst. I suggest rerunning the "1a" recipe: even though the data is the same, there may be differences between the two GMM systems, so the alignments and other things could differ slightly.

System                 1a     1b     1c     1d
dev_clean(fglarge)     3.87   3.84   3.89   4.01
dev_clean(tglarge)     3.97   4.00   4.09   4.16
dev_clean(tgmed)       4.95   5.16   5.17   5.18
dev_clean(tgsmall)     5.57   5.80   5.81   5.84
dev_other(fglarge)    10.22  10.15  10.24  10.51
dev_other(tglarge)    10.79  10.78  10.87  11.20
dev_other(tgmed)      13.01  13.09  13.23  13.54
dev_other(tgsmall)    14.36  14.61  14.63  15.14
test_clean(fglarge)    4.17   4.38   4.41   4.42
test_clean(tglarge)    4.36   4.58   4.54   4.61
test_clean(tgmed)      5.33   5.54   5.56   5.59
test_clean(tgsmall)    5.93   6.15   6.21   6.32
test_other(fglarge)   10.62  10.64  10.62  10.95
test_other(tglarge)   10.96  11.13  11.20  11.41
test_other(tgmed)     13.24  13.45  13.64  13.87
test_other(tgsmall)   14.53  14.92  15.00  15.19

Hang

@danpovey commented Jul 3, 2017

I see the problem. You are setting relu_dim=725, but the config just hardcodes the dimension to 512.
Try the 1c architecture with relu_dim really at 725 this time.
Also, change the frames_per_eg from 150 to 150,140,100.
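Concretely, the fix amounts to letting the xconfig take its dimension from the relu_dim variable instead of a hard-coded 512, and widening frames_per_eg. A hedged sketch of the relevant fragments (variable and file names assumed from the discussion, not copied from the PR):

```bash
# Hypothetical fragment of the tuning script showing the two changes.
dir=exp/chain/tdnn_1e_sp    # hypothetical experiment directory
relu_dim=725                # previously the xconfig hard-coded dim=512
frames_per_eg=150,140,100   # was 150

mkdir -p $dir/configs
cat <<EOF > $dir/configs/network.xconfig
  # layers now take their dimension from relu_dim
  relu-batchnorm-layer name=tdnn1 dim=$relu_dim input=Append(-1,0,1)
EOF

# frames_per_eg is then passed through to the chain training script,
# e.g. as --egs.chunk-width $frames_per_eg
```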

@LvHang commented Jul 3, 2017

Oh, I see; sorry for the mistake. jonlnichols used it, so I copied it directly. My fault.
I will change "frames_per_eg".
I checked steps/nnet3/decode.sh: the "--frames-per-chunk" option defaults to 50. The librispeech recipes don't set it, but the swbd recipes set it to the first value of "frames_per_eg". Should we set it as well? (See the sketch below.)
Also, from the results table, "1b" is better than "1c". Why don't we try "1b"'s architecture?
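For reference, the swbd-style pattern of tying the decode chunk size to the first value of frames_per_eg looks roughly like this (a sketch; the directory variables are hypothetical placeholders):

```bash
# Hypothetical snippet: take the first element of frames_per_eg
# (e.g. 150 out of "150,140,100") and pass it to decoding.
frames_per_eg=150,140,100
frames_per_chunk=$(echo $frames_per_eg | cut -d, -f1)

graph_dir=exp/chain/tree_sp/graph_tgsmall   # hypothetical paths
data_dir=data/dev_clean_hires
decode_dir=exp/chain/tdnn_1e_sp/decode_dev_clean_tgsmall

steps/nnet3/decode.sh --acwt 1.0 --post-decode-acwt 10.0 \
  --frames-per-chunk $frames_per_chunk \
  $graph_dir $data_dir $decode_dir
```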
Hang

@danpovey commented Jul 3, 2017 via email

@LvHang commented Jul 5, 2017

System                 1a     1b     1c     1d     1e
dev_clean(fglarge)     3.87   3.84   3.89   4.01   3.90
dev_clean(tglarge)     3.97   4.00   4.09   4.16   4.05
dev_clean(tgmed)       4.95   5.16   5.17   5.18   5.02
dev_clean(tgsmall)     5.57   5.80   5.81   5.84   5.64
dev_other(fglarge)    10.22  10.15  10.24  10.51  10.29
dev_other(tglarge)    10.79  10.78  10.87  11.20  10.88
dev_other(tgmed)      13.01  13.09  13.23  13.54  13.30
dev_other(tgsmall)    14.36  14.61  14.63  15.14  14.78
test_clean(fglarge)    4.17   4.38   4.41   4.42   4.28
test_clean(tglarge)    4.36   4.58   4.54   4.61   4.42
test_clean(tgmed)      5.33   5.54   5.56   5.59   5.46
test_clean(tgsmall)    5.93   6.15   6.21   6.32   6.05
test_other(fglarge)   10.62  10.64  10.62  10.95  10.90
test_other(tglarge)   10.96  11.13  11.20  11.41  11.45
test_other(tgmed)     13.24  13.45  13.64  13.87  13.82
test_other(tgsmall)   14.53  14.92  15.00  15.19  15.08

Hi Dan,
These are the new results for "1e", which is derived from "1c" by setting "relu_dim=725" and "frames_per_eg=150,140,100".
Comparing the results, "1e" is a little better than "1c" on the "dev_clean" sets, but looks worse on the other sets. It is still worse than "1a".

@LvHang commented Jul 9, 2017

System                 1a     1b     1c     1d     1e     1f
dev_clean(fglarge)     3.87   3.84   3.89   4.01   3.90   3.87
dev_clean(tglarge)     3.97   4.00   4.09   4.16   4.05   3.99
dev_clean(tgmed)       4.95   5.16   5.17   5.18   5.02   4.96
dev_clean(tgsmall)     5.57   5.80   5.81   5.84   5.64   5.42
dev_other(fglarge)    10.22  10.15  10.24  10.51  10.29  10.15
dev_other(tglarge)    10.79  10.78  10.87  11.20  10.88  10.77
dev_other(tgmed)      13.01  13.09  13.23  13.54  13.30  12.94
dev_other(tgsmall)    14.36  14.61  14.63  15.14  14.78  14.39
test_clean(fglarge)    4.17   4.38   4.41   4.42   4.28   4.14
test_clean(tglarge)    4.36   4.58   4.54   4.61   4.42   4.32
test_clean(tgmed)      5.33   5.54   5.56   5.59   5.46   5.28
test_clean(tgsmall)    5.93   6.15   6.21   6.32   6.05   5.88
test_other(fglarge)   10.62  10.64  10.62  10.95  10.90  10.80
test_other(tglarge)   10.96  11.13  11.20  11.41  11.45  11.13
test_other(tgmed)     13.24  13.45  13.64  13.87  13.82  13.37
test_other(tgsmall)   14.53  14.92  15.00  15.19  15.08  14.92

Hi Dan,
The above are the newest results.
For your convenience, here is a brief summary of the experiments:
"1a" is the original recipe (the result comes from the RESULTS file).
"1b" is jonlnichols's recipe with xconfigs.
"1c" is very similar to "1b", but one line shorter: it drops [relu-batchnorm-layer name=tdnn2 dim=512 input=Append(-1,0,1)].
"1d"'s topology is like "1a" but with "relu_dim=512".
"1e" is derived from "1c" by setting "relu_dim=725" and "frames_per_eg=150,140,100".
"1f" has the same TDNN structure as "1a", written with xconfigs, except for "frames_per_eg=150,140,100".

Except on the "test_other" set, I think "1f" is slightly better than "1a".
Do you have any suggestions?

Best,
Hang

@danpovey commented Jul 9, 2017 via email

@LvHang commented Jul 9, 2017

OK, I removed the unnecessary recipes.
Now local/chain/tuning/run_tdnn_1a.sh is the original recipe, and local/chain/tuning/run_tdnn_1b.sh is the xconfig recipe whose topology is the same as "1a" except for "frames_per_eg=150,140,100".
Also, local/chain/compare_wer.sh can be used to get the results easily.
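Invocation would be along these lines (experiment directory names assumed for illustration):

```bash
# Hypothetical usage of the new comparison script:
local/chain/compare_wer.sh exp/chain/tdnn_1a_sp exp/chain/tdnn_1b_sp
```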

danpovey merged commit 39c6dde into kaldi-asr:master on Jul 9, 2017