remove unused feature types in nnet3 scripts #1711
After addressing these comments, you can test it in the mini_librispeech setup.
If there is no discriminative-training example script there (and there probably isn't), please ask @hhadian to help you add one there, or @vimalmanohar if Hossein is not available and Vimal has time.
if [ -f $alidir/final.mat ]; then feat_type=lda; else feat_type=delta; fi
echo "$0: feature type is $feat_type"
echo "$0: feature type is delta"
sorry-- you should actually leave this file as it was. it uses the features from a baseline GMM-based system, which will normally be LDA.
egs/wsj/s5/steps/nnet3/decode.sh
Outdated
echo "$0: LDA transforms differ between $srcdir and $transform_dir"
exit 1;
fi
trans=raw_trans;
no need to use a variable now since it can have only one value.
echo "$0: LDA transforms differ between $srcdir and $transform_dir"
exit 1;
fi
trans=raw_trans;
same here, remove the variable.
egs/wsj/s5/steps/nnet3/get_degs.sh
Outdated
@@ -13,7 +13,7 @@ cmd=run.pl
max_copy_jobs=5 # Limit disk I/O

# feature options
feat_type=raw # set it to 'lda' to use LDA features.
feat_type=raw # set it to 'delta' to use delta features.
disable delta features here; this only runs on top of already-trained chain system so if we remove delta in the basic chain training scripts (which you should), it has no reason to exist here.
fi
fi
if [ -f $transform_dir/raw_trans.1 ] && [ $feat_type == "raw" ]; then
if [ -f $transform_dir/raw_trans.1 ]; then
@vimalmanohar, are we ever using get_egs_target.sh with other than raw features?
egs/wsj/s5/steps/nnet3/lstm/train.sh
Outdated
@@ -103,7 +103,6 @@ egs_opts=
transform_dir= # If supplied, this dir used instead of alidir to find transforms.
cmvn_opts= # will be passed to get_lda.sh and get_egs.sh, if supplied.
No need to change this script, it is deprecated so I don't want to waste time testing the change.
@@ -34,8 +34,8 @@ extra_left_context=0
extra_right_context=0
extra_left_context_initial=-1
extra_right_context_final=-1
feat_type= # you can set this in order to run on top of delta features, although we don't
# normally want to do this.
No need to have this variable any more. This script doesn't need to support delta features or any features except raw, and there is no need for a "feat_type" variable.
Weird, I updated it but it isn't changed here. But the file in "Files changed" tab is modified.
egs/wsj/s5/steps/nnet3/tdnn/train.sh
Outdated
@@ -71,7 +71,6 @@ egs_opts=
transform_dir= # If supplied, this dir used instead of alidir to find transforms.
cmvn_opts= # will be passed to get_lda.sh and get_egs.sh, if supplied.
# only relevant for "raw" features, not lda.
you can revert changes to this script, the script is deprecated.
egs/wsj/s5/steps/nnet3/train_tdnn.sh
Outdated
@@ -70,7 +70,6 @@ egs_opts=
transform_dir= # If supplied, this dir used instead of alidir to find transforms.
cmvn_opts= # will be passed to get_lda.sh and get_egs.sh, if supplied.
# only relevant for "raw" features, not lda.
feat_type=raw # or set to 'lda' to use LDA features.
you can also revert changes to this script.
Hi Dan,
@vimalmanohar @hhadian
frames_overlap_per_eg=30

## Nnet training options
effective_learning_rate=0.000001
if you copied this from a setup with more data, it may make sense to increase it.
but with this little data, you may not get any improvement at all so it may be hard to tune this value.
Yeah, I referred to some other scripts for the "effective_learning_rate" option.
In tedlium, it is 1.25*10^-6. In librispeech, it is 1*10^-6. In swbd, it is 1.25*10^-7.
But those datasets are all much bigger than mini_librispeech.
I don't have enough experience with this, so do you have a suggested value for this option?
Try 10 times larger.
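As a rough illustration of the suggestion above, the tenfold value can be computed rather than typed by hand. This is only a sketch: `scale_lr` is a hypothetical helper, not part of the Kaldi scripts, and it uses awk because shell arithmetic is integer-only.

```shell
# Hypothetical helper (not part of Kaldi): multiply a learning rate by a factor.
# awk is used because POSIX shell arithmetic only handles integers.
scale_lr() {
  awk -v lr="$1" -v k="$2" 'BEGIN { printf "%g\n", lr * k }'
}

effective_learning_rate=0.000001   # value from the script under review
new_lr=$(scale_lr "$effective_learning_rate" 10)
echo "effective_learning_rate=$new_lr"
```

Whether the larger value actually helps on mini_librispeech would still need to be checked by running the recipe.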
In egs/mini_librispeech/s5/local/chain/tuning/run_tdnn_1c_discriminative.sh:
> +tree_dir=exp/chain/tree_sp
+degs_dir= # If provided, will skip the degs directory creation
+lats_dir= # If provided, will skip denlats creation
+
+## Objective options
+criterion=smbr
+one_silence_class=true
+
+dir=${srcdir}_${criterion}
+
+## Egs options
+frames_per_eg=150
+frames_overlap_per_eg=30
+
+## Nnet training options
+effective_learning_rate=0.000001
Ok, thanks.
I checked steps/nnet3/train_discriminative.sh; it doesn't contain the "--adjust-priors" and "--modify-learning-rates" options.
But the discriminative scripts of swbd (swbd/s5c/local/chain/tuning/run_tdnn_6h_discriminative.sh) and librispeech (librispeech/s5/local/chain/run_tdnn_discriminative.sh) still use them.
So I guess they are out-of-date, right?
Hang
@vimalmanohar, any idea what is the problem here?
adjust-priors was made true by default and the option is removed. Also
modify-learning-rates is no longer required and is not supported anymore.
--
Vimal Manohar
PhD Student
Electrical & Computer Engineering
Johns Hopkins University
OK, @hanglv, please search through all example scripts for the use of
those options and remove them. (obviously excluding nnet2 scripts).
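One way to carry out this search is sketched below. It assumes GNU grep (for `--include`) and assumes that nnet2 scripts can be recognised by a `/nnet2/` path component; `find_stale_opts` is a hypothetical name, not an existing Kaldi utility.

```shell
# Sketch: list shell scripts under a directory that still pass the removed
# options, skipping anything under an nnet2 directory.
find_stale_opts() {
  # $1: root directory to search, e.g. egs
  grep -rlE --include='*.sh' -e '--(adjust-priors|modify-learning-rates)' "$1" \
    | grep -v '/nnet2/' || true   # '|| true' keeps an empty result from being an error
}

# Example usage from the top of a Kaldi checkout:
#   find_stale_opts egs
```

Each file it prints would still need to be edited by hand, since the options may be spread over continuation lines.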
Yes, I see.
@danpovey @vimalmanohar
I think when we use steps/nnet3/decode.sh to decode the discriminative model, the script will call "steps/nnet2/check_ivectors_compatible.sh" to check that the ivectors are compatible between "ivector-dir" and the discriminative dir. But we have never copied the "final.ie.id" of the base model dir (e.g. exp/chain/tdnn1c_sp), so it always prints a warning, right?
Hang
Probably the discriminative-training script should copy that; please include that in your PR. But make sure the script doesn't fail if final.ie.id is not present in the source directory. If the script does set -e at the top, then you may have to do something like
cp $srcdir/final.ie.id $dir 2>/dev/null || true
to suppress any error.
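The copy step suggested for final.ie.id can be sketched as a small function that is safe under `set -e`. The function name `copy_ivector_id` is hypothetical and used only for illustration; the `cp ... || true` line is the one quoted in the thread.

```shell
set -e   # the training script may run with this enabled

# Hypothetical helper: copy final.ie.id if it exists, and never abort if it doesn't.
copy_ivector_id() {
  # $1: source model directory, $2: destination directory
  cp "$1/final.ie.id" "$2" 2>/dev/null || true
}

# Example usage inside a training script:
#   copy_ivector_id "$srcdir" "$dir"
```

Note that `2>/dev/null || true` also hides genuine copy failures such as a full disk; an explicit `if [ -f "$1/final.ie.id" ]; then cp ...; fi` is an alternative if those should still stop the script.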
@danpovey @vimalmanohar
I think there is also a mistake. adjust-priors is now true by default, so "steps/nnet3/adjust_prior.sh" will always be called. This script generates ${dir}/${iter}_adj.mdl. However, in the swbd recipe, when it runs "steps/nnet3/decode.sh", "--iter epoch$x.adj" is set, which means it will use the ${dir}/${iter}.adj.mdl model. Obviously, the name is wrong. I will fix it in my PR.
One question: when we decode in the discriminative script, which model do we use by default, the "adj.mdl" or the ".mdl"?
Hang
I think you should try decoding both of them and see which is better. I don't have a lot of experience with the discriminative training and I don't recall which is normally better. Probably the adjusted one, but I'm not sure.
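To make the naming concrete, the sketch below prints the model file that each candidate `--iter` value would select when decoding both variants. It assumes, as described in the thread, that decode.sh loads `<dir>/<iter>.mdl` and that the prior-adjusted model is written with an `_adj` suffix; `list_decode_models` and the directory name are hypothetical.

```shell
# Sketch: print the model file selected by each --iter value.
# Assumes decode.sh loads <dir>/<iter>.mdl, as described in the thread.
list_decode_models() {
  model_dir=$1   # discriminative training directory
  epoch=$2       # epoch number
  for suffix in "" "_adj"; do
    iter="epoch${epoch}${suffix}"
    echo "${model_dir}/${iter}.mdl"
  done
}

# Hypothetical directory name, for illustration only.
list_decode_models exp/chain/tdnn1c_sp_smbr 1
```

In a real recipe the loop body would call the decoding script once per `iter` value instead of echoing the path, so that the two results can be compared.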
Hi Dan,
The "_adj" suffix means I used the "adjust_prior" model. The one without "_adj" is the normal model.
Hang
egs/wsj/s5/steps/nnet3/align.sh
Outdated
echo "$0: LDA transforms differ between $srcdir and $transform_dir"
exit 1;
fi
trans=raw_trans;
please don't use a variable here, just replace $trans with raw_trans
I have fixed it. Sorry for the carelessness.
merging-- thanks.
… (LDA, delta); add sMBR recipe for mini-librispeech (kaldi-asr#1711)
Hi Dan,
I checked all the files in the "egs/*/local/nnet3", "egs/*/local/chain" and "steps/nnet3" directories to make sure none of them set the "--feat-type" option.
I deleted the 'lda' branch in the "steps/nnet3/" scripts and fixed the code that handles "transform_dir". Now the scripts only apply the feature transform for "raw" features.
But there are three exceptions:
(1) "steps/nnet3/chain/build_tree.sh" appears to support only "delta" features.
(2) "egs/wsj/s5/steps/nnet3/get_degs.sh" and "steps/nnet3/make_denlats.sh" support "raw" and "delta".
The comment in "steps/nnet3/make_denlats.sh" says "you can set this in order to run on top of delta features, although we don't normally want to do this." So I kept it; I don't know whether I should delete it.
I'm testing it now. When the testing finishes, I will let you know.
Hang