remove unused feature types in nnet3 scripts #1711

LvHang · 2017-06-24T04:11:33Z

Hi Dan,
I checked all the files in the "egs/*/local/nnet3", "egs/*/local/chain" and "steps/nnet3" directories to make sure none of them sets the "--feat-type" option.
I deleted the 'lda' branch in the "steps/nnet3/" scripts and fixed the code that handles "transform_dir". The scripts now only apply the "raw" feature transform.
There are three exceptions, though.
(1) "steps/nnet3/chain/build_tree.sh" appears to support only "delta" features. (2) "egs/wsj/s5/steps/nnet3/get_degs.sh" and "steps/nnet3/make_denlats.sh" support "raw" and "delta".
The comment in "steps/nnet3/make_denlats.sh" says "you can set this in order to run on top of delta features, although we don't normally want to do this." So I kept it, but I don't know whether I should delete it.

I'm testing it now. When the testing finishes, I will let you know.
Hang

danpovey

After addressing these comments, you can test it in the mini_librispeech setup.
If there is no discriminative-training example script there (and there probably isn't), please ask @hhadian to help you add one there, or @vimalmanohar if Hossein is not available and Vimal has time.

danpovey · 2017-06-24T04:15:40Z

egs/wsj/s5/steps/nnet3/chain/build_tree.sh

-
-if [ -f $alidir/final.mat ]; then feat_type=lda; else feat_type=delta; fi
-echo "$0: feature type is $feat_type"
+echo "$0: feature type is delta"


Sorry, you should actually leave this file as it was. It uses the features from a baseline GMM-based system, which will normally be LDA.
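
The detection that this comment asks to keep can be sketched on its own. This is a minimal illustration built from the removed diff lines; the helper name `detect_feat_type` and the temporary directories are assumptions made for the demo, not part of build_tree.sh.

```shell
#!/usr/bin/env bash
# Sketch of the detection kept in build_tree.sh: the feature type follows
# from whether the GMM alignment directory contains an LDA matrix (final.mat).
# detect_feat_type is a hypothetical helper name used only for this demo.
detect_feat_type() {
  local alidir=$1
  if [ -f "$alidir/final.mat" ]; then
    echo lda
  else
    echo delta
  fi
}

# Demo on throwaway directories (stand-ins for a real alignment directory).
tmp=$(mktemp -d)
mkdir -p "$tmp/ali_lda" "$tmp/ali_delta"
touch "$tmp/ali_lda/final.mat"
lda_result=$(detect_feat_type "$tmp/ali_lda")
delta_result=$(detect_feat_type "$tmp/ali_delta")
echo "ali_lda: feature type is $lda_result"
echo "ali_delta: feature type is $delta_result"
rm -rf "$tmp"
```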

danpovey · 2017-06-24T04:16:42Z

egs/wsj/s5/steps/nnet3/decode.sh

-    echo "$0: LDA transforms differ between $srcdir and $transform_dir"
-    exit 1;
-  fi
+  trans=raw_trans;


no need to use a variable now since it can have only one value.
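
A rough sketch of what writing the name literally might look like. The pipeline string below only resembles the way Kaldi scripts assemble their feature command; the directory names are hypothetical, and the exact surrounding code in decode.sh is an assumption. No Kaldi binary is run, the string is only printed.

```shell
#!/usr/bin/env bash
# Sketch: build the feature pipeline string with "raw_trans" written literally.
# sdata and transform_dir are hypothetical paths chosen for illustration.
sdata=data/test/split4
transform_dir=exp/tri3_ali
feats="ark,s,cs:apply-cmvn --utt2spk=ark:$sdata/JOB/utt2spk scp:$sdata/JOB/cmvn.scp scp:$sdata/JOB/feats.scp ark:- |"

if [ -n "$transform_dir" ]; then
  # With the variable, this path would have read $transform_dir/$trans.JOB
  # (assumed form); since trans can only be raw_trans, it is spelled out.
  feats="$feats transform-feats --utt2spk=ark:$sdata/JOB/utt2spk ark:$transform_dir/raw_trans.JOB ark:- ark:- |"
fi
echo "$feats"
```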

danpovey · 2017-06-24T04:16:59Z

egs/wsj/s5/steps/nnet3/decode_looped.sh

-    echo "$0: LDA transforms differ between $srcdir and $transform_dir"
-    exit 1;
-  fi
+  trans=raw_trans;


same here, remove the variable.

danpovey · 2017-06-24T04:18:39Z

egs/wsj/s5/steps/nnet3/get_degs.sh

@@ -13,7 +13,7 @@ cmd=run.pl
 max_copy_jobs=5  # Limit disk I/O

 # feature options
-feat_type=raw     # set it to 'lda' to use LDA features.
+feat_type=raw     # set it to 'delta' to use delta features.


Disable delta features here. This only runs on top of an already-trained chain system, so if we remove delta in the basic chain training scripts (which you should), it has no reason to exist here.

danpovey · 2017-06-24T04:19:21Z

egs/wsj/s5/steps/nnet3/get_egs_targets.sh

-  fi
-fi
-if [ -f $transform_dir/raw_trans.1 ] && [ $feat_type == "raw" ]; then
+if [ -f $transform_dir/raw_trans.1 ]; then


@vimalmanohar, are we ever using get_egs_targets.sh with anything other than raw features?

danpovey · 2017-06-24T04:20:05Z

egs/wsj/s5/steps/nnet3/lstm/train.sh

@@ -103,7 +103,6 @@ egs_opts=
 transform_dir=     # If supplied, this dir used instead of alidir to find transforms.
 cmvn_opts=  # will be passed to get_lda.sh and get_egs.sh, if supplied.


No need to change this script, it is deprecated so I don't want to waste time testing the change.

danpovey · 2017-06-24T04:21:02Z

egs/wsj/s5/steps/nnet3/make_denlats.sh

@@ -34,8 +34,8 @@ extra_left_context=0
 extra_right_context=0
 extra_left_context_initial=-1
 extra_right_context_final=-1
-feat_type=  # you can set this in order to run on top of delta features, although we don't
-            # normally want to do this.


No need to have this variable any more. This script doesn't need to support delta features or any features except raw, and there is no need for a "feat_type" variable.

Weird, I updated it, but the change doesn't show up here. The file in the "Files changed" tab is modified, though.

danpovey · 2017-06-24T04:21:27Z

egs/wsj/s5/steps/nnet3/tdnn/train.sh

@@ -71,7 +71,6 @@ egs_opts=
 transform_dir=     # If supplied, this dir used instead of alidir to find transforms.
 cmvn_opts=  # will be passed to get_lda.sh and get_egs.sh, if supplied.
            # only relevant for "raw" features, not lda.


you can revert changes to this script, the script is deprecated.

danpovey · 2017-06-24T04:21:51Z

egs/wsj/s5/steps/nnet3/train_tdnn.sh

@@ -70,7 +70,6 @@ egs_opts=
 transform_dir=     # If supplied, this dir used instead of alidir to find transforms.
 cmvn_opts=  # will be passed to get_lda.sh and get_egs.sh, if supplied.
            # only relevant for "raw" features, not lda.
-feat_type=raw  # or set to 'lda' to use LDA features.


you can also revert changes to this script.

LvHang · 2017-06-24T05:30:43Z

Hi Dan,
I have addressed these comments.
Today I can write a discriminative training example script based on the corresponding swbd script, and then push it. After that, @vimalmanohar or @hhadian can help review it. Thanks for your work.
I will also test it with the mini_librispeech setup.
Hang

LvHang · 2017-06-25T05:17:44Z

@vimalmanohar @hhadian
I have added a discriminative recipe for mini_librispeech. Could you please help review it when you are free? Thanks.
I'm testing it now. I wrote it based on the discriminative scripts of swbd and librispeech. I hope there are no fundamental errors.
Hang

danpovey · 2017-06-25T05:26:23Z

egs/mini_librispeech/s5/local/chain/tuning/run_tdnn_1c_discriminative.sh

+frames_overlap_per_eg=30
+
+## Nnet training options
+effective_learning_rate=0.000001


if you copied this from a setup with more data, it may make sense to increase it.
but with this little data, you may not get any improvement at all so it may be hard to tune this value.

Yeah, I referred to some other scripts for the "effective_learning_rate" option.
In tedlium, it is 1.25*10^-6. In librispeech, it is 1*10^-6. In swbd, it is 1.25*10^-7.
But those corpora are all much bigger than mini_librispeech.
I don't have enough experience with this, so do you have a suggested value for this option?

danpovey · 2017-06-25T05:57:02Z

Try 10 times larger.

LvHang · 2017-06-25T05:57:58Z

Ok, thanks.

LvHang · 2017-06-26T02:59:23Z

I checked steps/nnet3/train_discriminative.sh, and it doesn't have the "--adjust-priors" and "--modify-learning-rates" options.
But the discriminative scripts of swbd (swbd/s5c/local/chain/tuning/run_tdnn_6h_discriminative.sh) and librispeech (librispeech/s5/local/chain/run_tdnn_discriminative.sh) still use them.
So I guess those scripts are out of date, right?
Hang

danpovey · 2017-06-26T03:01:58Z

@vimalmanohar, any idea what is the problem here?

vimalmanohar · 2017-06-26T03:12:28Z

adjust-priors was made true by default and the option is removed. Also modify-learning-rates is no longer required and is not supported anymore.

danpovey · 2017-06-26T03:35:04Z

OK, @hanglv, please search through all example scripts for the use of those options and remove them. (obviously excluding nnet2 scripts).
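
One way such a search could be done is with a recursive grep. The sketch below is only an illustration: the helper name `find_obsolete_option_users` and the demo tree are made up, and filtering on the string "nnet2" in the path is an assumed way to exclude the nnet2 scripts. It relies on GNU grep's `-r` and `--include`.

```shell
#!/usr/bin/env bash
# Sketch: list shell scripts that still pass the removed options,
# skipping anything whose path mentions nnet2.
# find_obsolete_option_users is a hypothetical helper name.
find_obsolete_option_users() {
  local root=$1
  grep -rlE --include='*.sh' -e '--(adjust-priors|modify-learning-rates)' "$root" \
    | grep -v nnet2 | sort
}

# Demo on a throwaway tree (made-up file names, not the real egs/ layout).
tmp=$(mktemp -d)
mkdir -p "$tmp/egs/a/local/chain" "$tmp/egs/a/local/nnet2" "$tmp/egs/b/local"
echo 'train.sh --adjust-priors true' > "$tmp/egs/a/local/chain/run_disc.sh"
echo 'train.sh --modify-learning-rates false' > "$tmp/egs/a/local/nnet2/run_disc.sh"
echo 'train.sh --criterion smbr' > "$tmp/egs/b/local/run_disc.sh"
hits=$(find_obsolete_option_users "$tmp/egs")
echo "$hits"
rm -rf "$tmp"
```

In the demo only the chain script is reported: the nnet2 script is filtered out by path and the third script does not use either option.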

LvHang · 2017-06-26T03:37:32Z

Yes, I see.

LvHang · 2017-06-26T21:32:50Z

@danpovey @vimalmanohar
I think that when we use steps/nnet3/decode.sh to decode the discriminative model, the script calls "steps/nnet2/check_ivectors_compatible.sh" to check that the i-vectors are compatible between the "ivector-dir" and the discriminative dir. But we never copy the "final.ie.id" from the base model dir (e.g. exp/chain/tdnn1c_sp), so it always prints a warning, right?
Hang

danpovey · 2017-06-26T21:34:38Z

Probably the discriminative-training script should copy that-- please include that in your PR. But make sure the script doesn't fail if the final.ie.id is not present in the source directory. If the script does set -e at the top, then you may have to do something like cp $srcdir/final.ie.id $dir 2>/dev/null || true to suppress any error.
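
The suggested pattern can be tried in isolation. This is a small sketch under `set -e`; the directory names are hypothetical stand-ins for the base model directory and the discriminative output directory, and the id string is a placeholder.

```shell
#!/usr/bin/env bash
set -e
# Sketch: carry the i-vector extractor id over to the new model directory
# without aborting under "set -e" when the source directory has none.
tmp=$(mktemp -d)
srcdir=$tmp/tdnn1c_sp          # hypothetical base model dir
dir=$tmp/tdnn1c_sp_smbr        # hypothetical discriminative output dir
mkdir -p "$srcdir" "$dir"

# Case 1: no final.ie.id in the source; cp fails but the script carries on.
cp "$srcdir/final.ie.id" "$dir" 2>/dev/null || true
[ -f "$dir/final.ie.id" ] && missing_case=copied || missing_case=skipped

# Case 2: the id exists and is copied.
echo "example-id" > "$srcdir/final.ie.id"
cp "$srcdir/final.ie.id" "$dir" 2>/dev/null || true
copied_id=$(cat "$dir/final.ie.id")
echo "missing case: $missing_case; copied id: $copied_id"
rm -rf "$tmp"
```

The `|| true` keeps the exit status of the whole command at zero, so `set -e` does not stop the script, and `2>/dev/null` hides the "No such file" message from cp.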

LvHang · 2017-06-26T23:56:05Z

@danpovey @vimalmanohar
I think there is another mistake. adjust-priors is now true by default, so "steps/nnet3/adjust_prior.sh" will always be called. That script generates ${dir}/${iter}_adj.mdl. However, in the swbd recipe, "steps/nnet3/decode.sh" is run with "--iter epoch$x.adj", which means it will look for the ${dir}/${iter}.adj.mdl model. The name is clearly wrong. I will fix it in my PR.
One question: when we decode in the discriminative script, which model should we use by default, the "adj.mdl" one or the plain ".mdl" one?
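
The mismatch described here comes down to two strings that differ by one character. A small sketch, with a hypothetical model directory; the last part shows one possible fix (passing the underscore form as the iteration name), which is an assumption rather than a record of what the PR changed.

```shell
#!/usr/bin/env bash
# Sketch of the naming mismatch: the file written after prior adjustment
# versus the file the decoder looks for when given --iter "epoch$x.adj".
dir=exp/chain/tdnn1c_sp_smbr   # hypothetical model directory
x=1
iter=epoch$x

written="${dir}/${iter}_adj.mdl"       # name produced by the prior-adjustment step
decode_iter="epoch$x.adj"              # value passed as --iter in the swbd recipe
looked_up="${dir}/${decode_iter}.mdl"  # name the decoder would then open

echo "written:   $written"
echo "looked up: $looked_up"

# Assumed fix: pass the underscore form so the two names agree.
# The braces matter: "$x_adj" would be read as a variable named x_adj.
fixed_iter="epoch${x}_adj"
fixed_lookup="${dir}/${fixed_iter}.mdl"
[ "$written" = "$fixed_lookup" ] && echo "names match with --iter $fixed_iter"
```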
Hang

danpovey · 2017-06-27T00:14:23Z

I think you should try decoding both of them and see which is better. I don't have a lot of experience with the discriminative training and I don't recall which is normally better. Probably the adjusted one, but I'm not sure.

LvHang · 2017-06-27T19:09:46Z

Hi Dan,
These are the latest results for discriminative training (offline decoding).

| model | tglarge_dev_clean_2 | tgsmall_dev_clean_2 |
|---|---|---|
| baseline | 10.85 | 14.9 |
| epoch1 | 10.16 | 14.3 |
| epoch1_adj | 10.13 | 14.26 |
| epoch2 | *10.11 | 14.33 |
| epoch2_adj | 10.21 | *14.23 |
| epoch3 | 10.41 | 14.49 |
| epoch3_adj | 10.36 | 14.37 |
| epoch4 | 10.53 | 14.56 |
| epoch4_adj | 10.5 | 14.49 |

The "_adj" suffix means I used the "adjust_prior" model; entries without "_adj" use the normal model.
A value with an asterisk in front of it is the best result in its column.
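
The asterisks can be cross-checked mechanically. The sketch below copies the numbers from the table and picks the smallest value in each column, assuming that lower is better (which is consistent with where the asterisks sit).

```shell
#!/usr/bin/env bash
# Sketch: find the lowest value in each column of the results table.
# Columns: model, tglarge_dev_clean_2, tgsmall_dev_clean_2.
results='baseline 10.85 14.9
epoch1 10.16 14.3
epoch1_adj 10.13 14.26
epoch2 10.11 14.33
epoch2_adj 10.21 14.23
epoch3 10.41 14.49
epoch3_adj 10.36 14.37
epoch4 10.53 14.56
epoch4_adj 10.5 14.49'

# "+ 0" forces a numeric comparison; the original strings are kept for printing.
best=$(printf '%s\n' "$results" | awk '
  NR == 1 || $2 + 0 < min1 + 0 { min1 = $2; name1 = $1 }
  NR == 1 || $3 + 0 < min2 + 0 { min2 = $3; name2 = $1 }
  END {
    printf "tglarge_dev_clean_2: %s (%s)\n", name1, min1
    printf "tgsmall_dev_clean_2: %s (%s)\n", name2, min2
  }')
echo "$best"
```

This reports epoch2 for tglarge_dev_clean_2 and epoch2_adj for tgsmall_dev_clean_2, matching the starred entries.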

Hang

danpovey · 2017-06-28T23:17:03Z

egs/wsj/s5/steps/nnet3/align.sh

-    echo "$0: LDA transforms differ between $srcdir and $transform_dir"
-    exit 1;
-  fi
+  trans=raw_trans;


please don't use a variable here, just replace $trans with raw_trans

LvHang · 2017-06-29T00:55:39Z

I have fixed it. Sorry for the carelessness.

danpovey · 2017-06-29T00:58:43Z

merging-- thanks.

Hang Lyu added 2 commits June 23, 2017 23:53

remove unused feature types in nnet3 scripts

d323b48

small fix about align.sh

6e474ad

danpovey reviewed Jun 24, 2017

fix about delta, replace trans variable and revert deprecated scripts

ab05726

add a discriminative recipe for mini_librispeech

fec1d13

danpovey reviewed Jun 25, 2017

Hang Lyu added 2 commits June 25, 2017 02:13

fix discriminative learning_rate

a341118

remove the --feat-type options in steps/libs/

02eaa44

Hang Lyu added 4 commits June 27, 2017 02:02

fix --adjust-priors, --modify-learning-rates, --iter format and ivector info

522bd50

small fix

a18adc2

fix type

db5ae9b

fix type2

4aede55

set the stage

03e655b

danpovey reviewed Jun 28, 2017

small fix

0b1b422

small fix

fd12427

danpovey merged commit 3505e86 into kaldi-asr:master Jun 29, 2017

danpovey mentioned this pull request Jul 10, 2017

Remove unused feature types in nnet3 scripts #1705

Closed

Skaiste pushed a commit to Skaiste/idlak that referenced this pull request Sep 26, 2018

[scripts,egs] simplify nnet3 scripts by removing unused feature types (LDA, delta); add sMBR recipe for mini-librispeech (kaldi-asr#1711)

fd6d0e0
