Added Dropout component in nnet3 to be used for ephemeral connections. #1032
Conversation
public:
  void Init(int32 dim, BaseFloat dropout_proportion = 0.0);

  DropoutComponent(int32 dim, BaseFloat dp = 0.0) { Init(dim, dp); }
dp is too short a name.
Done!
  DropoutComponent(int32 dim, BaseFloat dp = 0.0) { Init(dim, dp); }

  DropoutComponent():dim_(0), dropout_proportion_(0.0) { }
space after colon
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
You haven't added DropoutComponent to the components used in testing, so it's not being tested. It's not clear that the tests would even work without modification; it might be necessary to reset the random number generators for some of the comparisons.
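As a minimal, hypothetical illustration of the reseeding point (not the actual Kaldi test harness): two random draws only match if the generator is reset to the same seed in between, and the same holds for comparing two Propagate() calls of a component that draws a dropout mask.

#include <cassert>
#include <cstdlib>

int main() {
  srand(1);
  int first = rand();   // stands in for the first Propagate(), drawing mask M
  srand(1);             // reset: same seed, same random stream
  int second = rand();  // stands in for the second Propagate(), drawing mask M again
  assert(first == second);
  return 0;
}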
Also, we need a way to set the dropout-proportions via the command line.
Please push your changes, and also rebase to clean up the commit history; I prefer just one commit.
I pushed the new changes after rebasing (just a single commit!).
Force-pushed from 7e503a2 to b42f818.
@pegahgh, please go through all comments on this page and make sure you have addressed them -- same for the other PR you have. There are outstanding comments where the issue is not fixed.
@danpovey I reviewed all comments on this page and found all the issues fixed. Maybe the issue is that I rebased, so the fixed version of the code doesn't appear in the conversation!
Oh -- I just checked the conversation; it seems the files were updated if I click on the right tab.
Maybe the reason is that I commented on the conversation after committing the changes, so it is not outdated!
}
}

void Nnet::RemoveDropout() {
This function would not work: nnet3 isn't just a linear sequence of components. Just remove this function for now.
public:
  // This function is required in testing code and in other places we need
  // consistency in the random number generation (e.g. when optimizing
  // validation-set performance), but check where else we call sRand(). You'll
Should be srand, not sRand. This appears elsewhere too, I think.
@danpovey I can easily test ephemeral connections using dropout on a TDNN!
Before that can be done, I need to do some coding to make it possible to remove the dropout components. I want to make this part of a more general framework for editing nnets that will also be used in your multilingual stuff. So I think it might be better to wait a few days. But I guess if you want to start work on the scripting components, and you're not otherwise busy, it's OK with me.
  // effective.

  // Removes all dropout components from the network; can be used at test time.
  void RemoveDropout();
Can you please take out this RemoveDropout() function?
I have been thinking about it, and I think the best way to do this would be at the config level: a file "remove-ephemeral-connections.config" would be created that modifies the nodes that sum over the real and ephemeral connections, by replacing expressions like Add(real_node, ephemeral_node) with real_node.
Then we can clean up the model by taking out orphan nodes and components; I have written the code for this and will commit it at some point in the future.
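To make that concrete, here is a hypothetical before/after in the notation of the comment above (the filename and exact nnet3 config syntax are illustrative assumptions, not something settled in this thread):

# training config: the node sums the real and the ephemeral connection
output-node name=output input=Add(real_node, ephemeral_node)
# after applying remove-ephemeral-connections.config at test time
output-node name=output input=real_node

The ephemeral_node then becomes an orphan and can be pruned along with its components.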
Force-pushed from 0d37b1a to e866023.
Removed the RemoveDropout function.
KALDI_ASSERT(in_value.NumRows() == out_deriv.NumRows() &&
             in_value.NumCols() == out_deriv.NumCols());
in_deriv->AddMatMatDivMat(out_deriv, out_value, in_value);
Since this is adding, you should include kBackpropAdds in the Properties().
You should also add the kLinearInInput flag.
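To spell out why both flags apply (a sketch of the math using the names from the snippet above, not a quote from the Kaldi docs): for a fixed mask the component is linear in its input, and the backprop call recovers the mask from the stored values rather than keeping it around:

out_value = mask .* in_value         (elementwise dropout forward pass)
out_value ./ in_value = mask         (so the mask need not be stored)
in_deriv += out_deriv .* mask        ("adds" semantics, hence kBackpropAdds)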
@@ -143,7 +143,9 @@ void DropoutComponent::Backprop(const std::string &debug_info,

KALDI_ASSERT(in_value.NumRows() == out_deriv.NumRows() &&
             in_value.NumCols() == out_deriv.NumCols());
in_deriv->AddMatMatDivMat(out_deriv, out_value, in_value);
CuMatrix<BaseFloat> in_deriv_inplace(in_deriv->NumRows(), in_deriv->NumCols());
I'm not sure why you made these changes; this shouldn't have been necessary. Let me know if there was a test failure when you added the flag -- the cause might be something else.
I added kBackpropAdds and kLinearInInput to Properties().
It seems the AddMatMatDivMat method doesn't add to its output, which is against what we expect from most AddMat functions.
I didn't modify this function; I supported kBackpropAdds in the backprop by using a temporary matrix and adding it to in_deriv!
We may need to modify the AddMatMatDivMat method to do *this += a * b / c.
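A sketch of the workaround just described, reusing the names from the diff above (this mirrors the description in the comment, not necessarily the committed code; CuMatrixBase::AddMat does accumulate with a scale):

// Compute the derivative into a temporary, then accumulate it, because the
// (then-named) AddMatMatDivMat sets its output instead of adding to it.
CuMatrix<BaseFloat> in_deriv_inplace(in_deriv->NumRows(), in_deriv->NumCols());
in_deriv_inplace.AddMatMatDivMat(out_deriv, out_value, in_value);
in_deriv->AddMat(1.0, in_deriv_inplace);  // in_deriv += temp, honoring kBackpropAdds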
OK, I see the problem.
Should I rename the function in this PR, or do you plan to do this later?
Force-pushed from fea7c32 to 05d2e30.
Rename it.
@@ -989,7 +989,7 @@ void CuMatrixBase<Real>::AddMatBlocks(Real alpha, const CuMatrixBase<Real> &A,
/// dst = a * b / c (by element; when c = 0, dst = a)
/// dst can be an alias of a, b or c safely and get expected result.
template<typename Real>
-void CuMatrixBase<Real>::AddMatMatDivMat(const CuMatrixBase<Real> &A,
+void CuMatrixBase<Real>::SetMatMatDivMat(const CuMatrixBase<Real> &A,
If we don't do this now we'll forget about it -- can you also please change the kernel names, to remove the "add" and make it "set"?
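For reference, a plain-CPU sketch of the documented semantics (std::vector stands in for CuMatrix here; the real implementation is a CUDA kernel):

#include <cassert>
#include <vector>

// dst = a .* b ./ c elementwise, with dst = a wherever c == 0, matching the
// "dst = a * b / c (by element; when c = 0, dst = a)" doc comment quoted above.
void SetMatMatDivMatSketch(const std::vector<float> &a,
                           const std::vector<float> &b,
                           const std::vector<float> &c,
                           std::vector<float> *dst) {
  assert(a.size() == b.size() && b.size() == c.size() &&
         c.size() == dst->size());
  for (size_t i = 0; i < a.size(); i++)
    (*dst)[i] = (c[i] == 0.0f) ? a[i] : a[i] * b[i] / c[i];
}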
  return stream.str();
}

void DropoutComponent::Propagate(const ComponentPrecomputedIndexes *indexes,
In this implementation of dropout you are using a different mask for each sample in the minibatch. I don't know if other versions of dropout do this (can anyone clarify?). If they don't, you should probably highlight this in the comments. I think this version is better.
Hi Vijay, I am not aware of a paper where it is mentioned that the same dropout mask should be used for all the samples within the mini-batch. Yes, I am assuming that all the samples are independent, having different NN topologies, so each sample has a different mask. Has anybody tried having it the other way? Thanks!
K.
Karel's did. I expect they probably normally do it that way, but I don't know.
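A toy sketch of the distinction being discussed (illustrative only, not Kaldi's GPU code): per-sample dropout draws an independent mask for every row of the minibatch, whereas a shared mask would draw one row of Bernoulli values and reuse it across all rows.

#include <cstdlib>
#include <vector>

// Per-sample dropout: every element of every row gets its own
// Bernoulli(dropout_proportion) chance of being zeroed, so each sample
// in the minibatch sees a different mask.
void ApplyPerSampleDropout(std::vector<std::vector<float> > *minibatch,
                           float dropout_proportion) {
  for (auto &row : *minibatch)   // one independent mask per sample
    for (auto &x : row)
      if (static_cast<float>(rand()) / RAND_MAX < dropout_proportion)
        x = 0.0f;                // this element is dropped
}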
@pegahgh I was curious about the expected completion date for this component.
The only thing that has to be done is to rename some kernels.
Oh! I will rename them right now!
Force-pushed from 7bfa02e to f5691bb.
Done!
// random components. You must also set srand() for this to be
// effective.

>>>>>>> 05d2e30... added DropoutComponent to nnet3 for ephemeral connection.
Looks like you did not completely resolve these conflicts!
Oh sorry! Fixed!
Force-pushed from f5691bb to a48d6b3.
@@ -1219,6 +1219,15 @@ static void GenerateRandomComponentConfig(std::string *component_type,
os << " self-repair-target=" << RandUniform();
break;
}
<<<<<<< HEAD
=======
And here -- check for === via grep, etc.
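For example, a standard grep catches leftovers from all three kinds of conflict markers:

grep -rnE '^(<<<<<<<|=======|>>>>>>>)' src/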
Also, please make sure this compiles with GPU... if you had compiled, you wouldn't have missed these errors.
Fixed the typo, and it compiled with GPU!