Nnet1 dropout ivec #1090
Conversation
KarelVesely84 commented Oct 3, 2016 (edited)
- added support for annealed dropout,
- created an example of how to prepare Kaldi i-vectors on fMLLR features (AMI, IHM),
- nice gains on AMI IHM (dev, eval): w/o i-vector 24.2 / 24.5, per-speaker i-vector 23.2 / 22.8 (Dan's lattice-free MMI: 22.4 / 22.4),
- changed the scripts to support the 'annealed dropout',
@@ -22,7 +22,7 @@ feature_transform=
 max_iters=20
 min_iters=0 # keep training, disable weight rejection, start learn-rate halving as usual,
 keep_lr_iters=0 # fix learning rate for N initial epochs, disable weight rejection,
-dropout_iters= # Disable dropout after 'N' initial epochs,
+dropout_schedule= # Dropout schedule for N initial epochs, for example: 0.9,0.9,0.9,0.9,0.9,1.0
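For illustration, here is a small sketch of how such a comma-separated schedule could be expanded into a per-epoch value. The helper names are hypothetical; this is not Kaldi's actual parsing code, and whether the values mean "keep" or "drop" probabilities is exactly what the review discussion below settles.

```python
# Hypothetical sketch: expand a comma-separated dropout schedule
# (e.g. "0.9,0.9,0.9,0.9,0.9,1.0") into per-epoch values.
# Not Kaldi's actual parsing code.

def parse_dropout_schedule(schedule: str):
    """Parse 'v1,v2,...' into a list of floats; empty string -> no schedule."""
    return [float(v) for v in schedule.split(",")] if schedule else []

def dropout_value_for_epoch(schedule, epoch):
    """Return the schedule entry for 'epoch' (0-based); once the
    schedule is exhausted, the final value stays in effect."""
    if not schedule:
        return None  # dropout disabled
    return schedule[min(epoch, len(schedule) - 1)]

values = parse_dropout_schedule("0.9,0.9,0.9,0.9,0.9,1.0")
print(dropout_value_for_epoch(values, 2))   # 0.9
print(dropout_value_for_epoch(values, 10))  # 1.0 (last value persists)
```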
Are these probabilities the probability of dropout, or the probability of not dropping out?
I think when people normally describe dropout, it's the probability of setting the feature to zero, e.g. see
https://pdfs.semanticscholar.org/c2d7/8722ebac92766f1154497d8424108d906ae3.pdf
Perhaps if you renamed this to dropout_retention_schedule it would lead to less confusion?
Dan, how do you have it in nnet2, nnet3? As the probability that a neuron is dropped? It would be good to have it the same way; I will change it then...
(Actually, I was already thinking about it and discussed it with Harish.)
- there's backward compatibility in Dropout::Read()
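The backward compatibility amounts to a simple conversion: older models stored a retention probability ('DropoutRetention'), newer ones a drop probability ('DropoutRate'), and the two are related by rate = 1 - retention. A minimal sketch of that mapping (the function name is hypothetical; this is not the actual C++ in Dropout::Read()):

```python
# Hypothetical sketch of the retention -> rate conversion implied by
# the backward-compatible Dropout::Read(); not the actual Kaldi C++.

def dropout_rate_from_token(token: str, value: float) -> float:
    """Map either a legacy '<DropoutRetention>' value or a new
    '<DropoutRate>' value to a drop probability."""
    if token == "<DropoutRetention>":
        return 1.0 - value   # legacy: probability of KEEPING a neuron
    if token == "<DropoutRate>":
        return value         # new: probability of DROPPING a neuron
    raise ValueError(f"unexpected token {token}")

# A legacy retention of 0.8 and a new rate of 0.2 describe the same model:
legacy = dropout_rate_from_token("<DropoutRetention>", 0.8)
new = dropout_rate_from_token("<DropoutRate>", 0.2)
```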
Hi Karel, Harish
Hi Harish (@mallidi), this is the more standard formulation, as @danpovey pointed out. Given that the values in 'dropout_schedule' are now the probabilities that the neurons are dropped, we can keep the original variable name 'dropout_schedule'. The C++ code knows how to read the older models with 'DropoutRetention' instead of 'DropoutRate', so there is backward compatibility... Is it okay for your needs?
Sure @vesis84. Thanks a lot for the annealed dropout. Harish.
You are welcome ;) What seemed to work well on 'ami-ihm' was a dropout rate of 0.2 for the 5 initial epochs, then switching it to 0.0 (no dropout). This starts the learning-rate decay, as without dropout the cross-entropy immediately increases on the 'cv' data (and massively decreases on the 'tr' data)... This schedule was better than 0.5 0.4 0.3 0.2 0.1 0.0 and some other combinations. There is one detail in the implementation: after applying the dropout mask, the output is up-scaled by 1/(1-p_drop), while the cross-validations are always run without dropout (hard-coded in the training binaries). The up-scaling does a good job; there doesn't seem to be a severe mismatch caused by disabling the dropout in the cross-validation step... [I tried exponentiating the 1/(1-p_drop) factor to make it a little larger/smaller, but this caused a mismatch visible in the 'cv' loss.] K.
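The up-scaling described above is the standard "inverted dropout" trick: surviving activations are multiplied by 1/(1 - p_drop) during training so their expected value matches the dropout-free forward pass used at cross-validation time. A minimal NumPy sketch under that assumption (not the Kaldi implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
p_drop = 0.2                      # drop probability, as in the 0.2-for-5-epochs schedule
x = rng.standard_normal((4, 8))   # a batch of hidden-layer activations

# Training: zero out each unit with probability p_drop, then up-scale
# the survivors by 1/(1 - p_drop) so the expected output matches x.
mask = (rng.random(x.shape) >= p_drop).astype(x.dtype)
y_train = x * mask / (1.0 - p_drop)

# Cross-validation: dropout disabled, plain forward pass (as hard-coded
# in the nnet1 training binaries).
y_cv = x
```

Because of the up-scaling, E[y_train] equals y_cv element-wise, which is why disabling dropout at CV time introduces no systematic scale mismatch.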
@danpovey I am done with the changes; from my side it's ready.