updating librispeech recipe 2 #1727
Conversation
Thanks. But if the 1c turns out to be better than 1b, I'd rather not include the 1b, to avoid adding bulk to the repository.
Sure, I will create the script compare_wer.sh.
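A minimal sketch of what such a comparison script might do (the function name, directory layout, and scoring-file names are assumptions based on typical Kaldi scoring output, not the actual script added in this PR):

```shell
#!/usr/bin/env bash
# compare_wer (sketch): print the best %WER found under each experiment
# directory passed as an argument. Assumes Kaldi-style scoring files
# named decode_*/wer_<LMWT>_<penalty>, each starting with a "%WER N.NN" line.
compare_wer() {
  local dir best
  for dir in "$@"; do
    best=$(grep -h '%WER' "$dir"/decode_*/wer_* 2>/dev/null \
           | awk '{print $2}' | sort -n | head -n 1)
    printf '%s %s\n' "$dir" "${best:-N/A}"
  done
}
```

Invoked as, e.g., `compare_wer exp/chain/tdnn_1a exp/chain/tdnn_1c`, it would print one best-WER line per system.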
Hi Dan,
Hang
I see the problem. You are setting relu_dim=725, but the config just hardcodes the dimension to 512.
Oh, I see. Sorry for the mistake. jonlnichols's recipe uses it, so I copied it directly. It's my fault.
You don't have to change the --frames-per-chunk in decoding, because TDNNs are not sensitive to it.
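As a sketch of the swbd-style setting mentioned in this thread (the variable names follow the discussion; the invocation at the end is illustrative, not taken from this PR):

```shell
# Sketch: derive the decoding chunk size from the first value of
# frames_per_eg, as the swbd recipes do. The librispeech recipes leave
# the decode-time default of 50, which is fine for TDNNs since their
# output is largely insensitive to this value.
frames_per_eg=150,140,100
frames_per_chunk=${frames_per_eg%%,*}   # strip everything after the first comma
echo "$frames_per_chunk"
# Hypothetical invocation (other options omitted):
# steps/nnet3/decode.sh --frames-per-chunk "$frames_per_chunk" ...
```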
I suggest trying 1c rather than 1b, because it has one fewer layer; this will interact with adding more parameters.
…On Mon, Jul 3, 2017 at 5:17 PM, LvHang ***@***.***> wrote:
Oh, I see. Sorry for the mistake. jonlnichols's recipe uses it, so I copied it directly. It's my fault.
I will change the "frames_per_eg".
I checked steps/nnet3/decode.sh; the "--frames-per-chunk" option defaults to 50. The librispeech recipes don't set it, but the swbd recipes set it to the first value of "frames_per_eg". Do we also need to set it?
Also, judging from the results, "1b" looks better than "1c". Why don't we try "1b"'s architecture?
Hang
Hi Dan,
Hi Dan, Except on the "test_other" dataset, I think "1f" is slightly better than "1a". Best,
Great work!
Please rename 1f to 1b and check it in.
…On Sat, Jul 8, 2017 at 8:35 PM, LvHang ***@***.***> wrote:
System                 1a     1b     1c     1d     1e     1f
dev_clean(fglarge)     3.87   3.84   3.89   4.01   3.90   3.87
dev_clean(tglarge)     3.97   4.00   4.09   4.16   4.05   3.99
dev_clean(tgmed)       4.95   5.16   5.17   5.18   5.02   4.96
dev_clean(tgsmall)     5.57   5.80   5.81   5.84   5.64   5.42
dev_other(fglarge)    10.22  10.15  10.24  10.51  10.29  10.15
dev_other(tglarge)    10.79  10.78  10.87  11.20  10.88  10.77
dev_other(tgmed)      13.01  13.09  13.23  13.54  13.30  12.94
dev_other(tgsmall)    14.36  14.61  14.63  15.14  14.78  14.39
test_clean(fglarge)    4.17   4.38   4.41   4.42   4.28   4.14
test_clean(tglarge)    4.36   4.58   4.54   4.61   4.42   4.32
test_clean(tgmed)      5.33   5.54   5.56   5.59   5.46   5.28
test_clean(tgsmall)    5.93   6.15   6.21   6.32   6.05   5.88
test_other(fglarge)   10.62  10.64  10.62  10.95  10.90  10.80
test_other(tglarge)   10.96  11.13  11.20  11.41  11.45  11.13
test_other(tgmed)     13.24  13.45  13.64  13.87  13.82  13.37
test_other(tgsmall)   14.53  14.92  15.00  15.19  15.08  14.92
Hi Dan,
The above is the newest results.
For your convenience, I briefly summarize the experiments:
"1a" is the original recipe (the result comes from the "RESULTS" file).
"1b" is jonlnichols's recipe with xconfigs.
"1c" is very similar to "1b", but has one line fewer than "1b":
  relu-batchnorm-layer name=tdnn2 dim=512 input=Append(-1,0,1)
"1d"'s topology looks like "1a", except with "relu_dim=512".
"1e" is derived from "1c", with "relu_dim=725" and "frames_per_eg=150,140,100".
"1f"'s TDNN structure looks like "1a" in xconfig form, except with "frames_per_eg=150,140,100".
Except on the "test_other" dataset, I think "1f" is slightly better than "1a".
Do you have any suggestions?
Best,
Hang
Ok. I removed the unnecessary recipes.
* 'master' of https://github.com/kaldi-asr/kaldi: (36 commits)
  [scripts] Fix convert_nnet2_to_nnet3.py (kaldi-asr#1774)
  [egs] Add missing make_corpus_subset.sh in babel_multilang example (kaldi-asr#1766)
  [egs] Graphemic lexicon updates / fixes in babel/s5d recipe and hub4_spanish recipe (kaldi-asr#1740)
  [egs] update hkust results (kaldi-asr#1772)
  [egs] Update AMI chain experiments RE dropout, decay-time and proportional-shrink (kaldi-asr#1732)
  [egs] Fixes to the aishell (Mandarin) recipe (kaldi-asr#1770)
  [egs] Add recipe for aishell data (free Mandarin corpus, 170 hours total) (kaldi-asr#1742)
  [src] Change to arpa-reading code to accept blank lines with whitespace (kaldi-asr#1752)
  [scripts] For nnet3 training, add option to disable the model-combination (kaldi-asr#1757)
  [scripts] minor bugfix to nnet1 alignment script when creating lattices (kaldi-asr#1764)
  [src] Add support for row/column ranges when reading GeneralMatrix (kaldi-asr#1761)
  [src] Change name of option --norm-mean->--norm-means for consistency, thanks: 415198468@qq.com
  [egs] swbd/s5c, added 5 layer (b)lstm recipes (kaldi-asr#1759)
  [scripts] Fix bug in segment_long_utterances.sh (kaldi-asr#1758)
  [src] Fix indexing error in nnet1::Convolutional2DComponent (kaldi-asr#1755)
  [src] Fix usage message of program (thanks:jubang0219@gmail.com)
  [egs] some small updates to scripts (installing beamformit; segmentation example)
  [egs] Small fix to ami/s5b/local/chain/compare_wer_general.sh (kaldi-asr#1751)
  [build] Add configuration check for incompatible g++ compilers when CUDA is enabled. (kaldi-asr#1749)
  [egs] Update Librispeech nnet3 TDNN recipe (old one did not run) (kaldi-asr#1727)
  ...
This PR carries on from #1708.
It moves the original "tdnn" recipes to a tuning directory and adds new recipes using "xconfigs".
Thanks to @jonlnichols for the xconfig TDNN recipe (local/chain/tuning/run_tdnn_1b.sh).