
Initializing AttentionPool weights with 2 * Identity matrix. #21

Closed
dohlee opened this issue Feb 11, 2023 · 1 comment
dohlee commented Feb 11, 2023

Hi, thanks for this great implementation. I'm learning a lot from it :)

I noticed that commit 0dc46e4 initializes the internal AttentionPool weight with randomly sampled values rather than with 2 * Identity, as specified in the Enformer paper. (If I'm missing something, please let me know!)

This won't affect the performance of Enformer loaded with pretrained parameters, but I think it may lead to slightly worse performance (according to the paper) when training from scratch.

Perhaps some simple manual weight initialization like

self.to_attn_logits = nn.Conv2d(dim, dim, 1, bias = False)
self.to_attn_logits.weight.data.zero_()
self.to_attn_logits.weight.data.squeeze().fill_diagonal_(2.)

will do.
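As a sanity check, here is a minimal self-contained sketch of that initialization (the `dim = 4` size and the input shape are illustrative assumptions, not values from the Enformer code). A 1x1 conv whose weight matrix is 2 * Identity should act as pointwise multiplication by 2:

```python
import torch
import torch.nn as nn

dim = 4  # illustrative channel dimension

# same construction as proposed above
to_attn_logits = nn.Conv2d(dim, dim, 1, bias = False)

# zero the (dim, dim, 1, 1) weight, then set the (dim, dim) diagonal to 2,
# giving a 2 * Identity kernel as described in the Enformer paper
to_attn_logits.weight.data.zero_()
to_attn_logits.weight.data.squeeze().fill_diagonal_(2.)

# with this weight, the conv should simply double its input
x = torch.randn(1, dim, 8, 1)
out = to_attn_logits(x)
assert torch.allclose(out, 2 * x, atol = 1e-6)
```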

If you think this is okay, please let me know and I'll open a PR right away.

Thanks again for this great repo!

Best,
Dohoon

lucidrains added a commit that referenced this issue Feb 11, 2023
lucidrains (Owner) commented:

@dohlee Hi Dohoon! Thank you for catching this and glad to see another researcher applying attention to genomics! I've made the change here
