Hi, thanks for this great implementation. I'm learning a lot from it :)
I noticed that commit 0dc46e4 initializes the internal `AttentionPool` weight with randomly sampled values rather than 2 * the identity matrix, as specified in the Enformer paper. (If I'm missing something, please let me know!)
This will not affect the performance of Enformer loaded with pretrained parameters, but I think it may lead to slightly worse performance (according to the paper) when training from scratch.
Perhaps some simple manual weight initialization like
will do.
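For illustration, here is a minimal sketch of what such an initialization might look like. It assumes the pooling logits come from a bias-free 1x1 `nn.Conv2d`; the layer name `to_attn_logits`, the helper `init_attention_pool_weight`, and `dim = 8` are hypothetical and not taken from the repository.

```python
import torch
from torch import nn


def init_attention_pool_weight(conv: nn.Conv2d, scale: float = 2.0) -> None:
    """Set a 1x1 conv weight to scale * identity (hypothetical helper)."""
    # dirac_ fills the kernel so that the conv acts as an identity map
    # across channels; for a 1x1 kernel this is an identity matrix.
    nn.init.dirac_(conv.weight)
    # Scale in place without recording the operation in autograd.
    with torch.no_grad():
        conv.weight.mul_(scale)


# Assumed layer shape: a bias-free 1x1 conv producing the attention logits.
dim = 8
to_attn_logits = nn.Conv2d(dim, dim, kernel_size=1, bias=False)
init_attention_pool_weight(to_attn_logits)
```

With this in place, the freshly constructed layer would start as 2 * identity instead of PyTorch's default random initialization, while loading pretrained parameters would simply overwrite it.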
If that sounds okay to you, please let me know and I'll open a PR right away.
Thanks again for this great repo!
Best,
Dohoon