Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove unused feature types in nnet3 scripts #1711

Merged
merged 13 commits into from
Jun 29, 2017

Conversation

LvHang
Copy link
Contributor

@LvHang LvHang commented Jun 24, 2017

Hi Dan,
I checked all the files in "egs//local/nnet3", "egs//local/chain" and "steps/nnet3" directories to make sure none of them set the "--feat-type" option.
I deleted the 'lda' branch in "steps/nnet3/ scripts" and fixed the codes about "transform_dir". Now, the scripts only execute the feat-transform about "raw".
But there are three exceptions.
(1)"steps/nnet3/chain/build_tree.sh" looks only support "delta" feature. (2)"egs/wsj/s5/steps/nnet3/get_degs.sh" and "steps/nnet3/make_denlats.sh" support "raw" and "delta".
In the comment of "steps/nnet3/make_denlats.sh", it said "you can set this in order to run on top of delta features, although we don't normally want to do this." So I keep it. I don't know if I should delete it.

Now, I'm testing it. When the testing finish, I will inform you.
Hang

Copy link
Contributor

@danpovey danpovey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After addressing these comments, you can test it in the mini_librispeech setup.
If there is no discriminative-training example script there (and there probably isn't), please ask @hhadian to help you add one there, or @vimalmanohar if Hossein is not available and Vimal has time.


if [ -f $alidir/final.mat ]; then feat_type=lda; else feat_type=delta; fi
echo "$0: feature type is $feat_type"
echo "$0: feature type is delta"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry-- you should actually leave this file as it was. it uses the features from a baseline GMM-based system, which will normally be LDA.

echo "$0: LDA transforms differ between $srcdir and $transform_dir"
exit 1;
fi
trans=raw_trans;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to use a variable now since it can have only one value.

echo "$0: LDA transforms differ between $srcdir and $transform_dir"
exit 1;
fi
trans=raw_trans;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, remove the variable.

@@ -13,7 +13,7 @@ cmd=run.pl
max_copy_jobs=5 # Limit disk I/O

# feature options
feat_type=raw # set it to 'lda' to use LDA features.
feat_type=raw # set it to 'delta' to use delta features.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

disable delta features here; this only runs on top of already-trained chain system so if we remove delta in the basic chain training scripts (which you should), it has no reason to exist here.

fi
fi
if [ -f $transform_dir/raw_trans.1 ] && [ $feat_type == "raw" ]; then
if [ -f $transform_dir/raw_trans.1 ]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vimalmanohar, are we ever using get_egs_target.sh with other than raw features?

@@ -103,7 +103,6 @@ egs_opts=
transform_dir= # If supplied, this dir used instead of alidir to find transforms.
cmvn_opts= # will be passed to get_lda.sh and get_egs.sh, if supplied.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to change this script, it is deprecated so I don't want to waste time testing the change.

@@ -34,8 +34,8 @@ extra_left_context=0
extra_right_context=0
extra_left_context_initial=-1
extra_right_context_final=-1
feat_type= # you can set this in order to run on top of delta features, although we don't
# normally want to do this.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to have this variable any more. This script doesn't need to support delta features or any features exept raw, and there is no need for a "feat_type" variable.

Copy link
Contributor Author

@LvHang LvHang Jun 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird, I updated it but it isn't changed here. But the file in "Files changed" tab is modified.

@@ -71,7 +71,6 @@ egs_opts=
transform_dir= # If supplied, this dir used instead of alidir to find transforms.
cmvn_opts= # will be passed to get_lda.sh and get_egs.sh, if supplied.
# only relevant for "raw" features, not lda.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can revert changes to this script, the script is deprecated.

@@ -70,7 +70,6 @@ egs_opts=
transform_dir= # If supplied, this dir used instead of alidir to find transforms.
cmvn_opts= # will be passed to get_lda.sh and get_egs.sh, if supplied.
# only relevant for "raw" features, not lda.
feat_type=raw # or set to 'lda' to use LDA features.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can also revert changes to this script.

@LvHang
Copy link
Contributor Author

LvHang commented Jun 24, 2017

Hi Dan,
I have already addressed these comments.
Today, I can write a discriminative training example script refer to swbd's relevant script, then I will push it. After that, @vimalmanohar or @hhadian can help to review it. Thanks for your work.
And I will test it with mini_librispeech setup.
Hang

@LvHang
Copy link
Contributor Author

LvHang commented Jun 25, 2017

@vimalmanohar @hhadian
I have added a discriminative recipe for mini_librispeech. Could you please help to review it when you are free. Thanks.
I'm testing it. I write it refer to the discriminative scripts of swbd and librispeech. I hope there is no essential error.
Hang

frames_overlap_per_eg=30

## Nnet training options
effective_learning_rate=0.000001
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you copied this from a setup with more data, it may make sense to increase it.
but with this little data, you may not get any improvement at all so it may be hard to tune this value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I refer to some scripts for the "effective_learning_rate" option.
In tedlium, it is 1.2510^-6. In librispeech, it is 110^-6. In swbd, it is 1.25*10^-7.
But they are all much bigger than mini_librispeech.
I don't have enough experience for it. So do you have any suggestive value for this option?

@danpovey
Copy link
Contributor

danpovey commented Jun 25, 2017 via email

@LvHang
Copy link
Contributor Author

LvHang commented Jun 25, 2017

Ok, thanks.

@LvHang
Copy link
Contributor Author

LvHang commented Jun 26, 2017

I check the steps/nnet3/train_discriminative.sh, it doesn't contain the "--adjust-priors" and "--modify-learning-rates" options.
But the discriminative scripts of swbd(swbd/s5c/local/chain/tuning/run_tdnn_6h_discriminative.sh) and librispeech(librispeech/s5/local/chain/run_tdnn_discriminative.sh ) still use them.
So I guess they are out-of-date, right?
Hang

@danpovey
Copy link
Contributor

danpovey commented Jun 26, 2017 via email

@vimalmanohar
Copy link
Contributor

vimalmanohar commented Jun 26, 2017 via email

@danpovey
Copy link
Contributor

danpovey commented Jun 26, 2017 via email

@LvHang
Copy link
Contributor Author

LvHang commented Jun 26, 2017

Yes, i see.

@LvHang
Copy link
Contributor Author

LvHang commented Jun 26, 2017

@danpovey @vimalmanohar
I think when we use steps/nnet3/decode.sh to decode the discriminative model, the script will call "steps/nnet2/check_ivectors_compatible.sh" to check the ivectors compatible between "ivector-dir" and "discriminative dir". But we have never copied the "final.ie.id" of the base model dir(eg. exp/chain/tdnn1c_sp). So that it always prompt warning, right?
Hang

@danpovey
Copy link
Contributor

danpovey commented Jun 26, 2017 via email

@LvHang
Copy link
Contributor Author

LvHang commented Jun 26, 2017

@danpovey @vimalmanohar
I think there is also a mistake. Now the adjust-priors was made true by default. so that the "steps/nnet3/adjust_prior.sh" always will be called. This script will generate the ${dir}/${iter}_adj.mdl. However, in swbd corpus, when it run "steps/nnet3/decode.sh", the "--iter epoch$x.adj" will be set, that means it will use ${dir}/${iter}.adj.mdl model. Obviously, the name is wrong. I will fix it in my PR.
For a question, when we do decoding in discriminative script, which kind of model we always use by default, the "adj.mdl" or ".mdl"?
Hang

@danpovey
Copy link
Contributor

danpovey commented Jun 27, 2017 via email

@LvHang
Copy link
Contributor Author

LvHang commented Jun 27, 2017

Hi Dan,
This is the newest results about discriminative training. (offline-decoding)

tglarge_dev_clean_2 tgsmall_dev_clean_2
baseline 10.85 14.9
epoch1 10.16 14.3
epoch1_adj 10.13 14.26
epoch2 *10.11 14.33
epoch2_adj 10.21 *14.23
epoch3 10.41 14.49
epoch3_adj 10.36 14.37
epoch4 10.53 14.56
epoch4_adj 10.5 14.49

The "_adj" means I use the "adjust_prior" model. The one without "_adj" means the normal model.
The numerical value with a asterisk in front of it means the best result.

Hang

echo "$0: LDA transforms differ between $srcdir and $transform_dir"
exit 1;
fi
trans=raw_trans;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please don't use a variable here, just replace $trans with raw_trans

@LvHang
Copy link
Contributor Author

LvHang commented Jun 29, 2017

I have fixed it. Sorry for the carelessness.

@danpovey
Copy link
Contributor

merging-- thanks.

@danpovey danpovey merged commit 3505e86 into kaldi-asr:master Jun 29, 2017
Skaiste pushed a commit to Skaiste/idlak that referenced this pull request Sep 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants