code details #61

Closed
Cancerce1l opened this issue Mar 22, 2019 · 5 comments

Comments

@Cancerce1l

Cancerce1l commented Mar 22, 2019

As I understand from the paper, the sampling position for the input image is the sum of the grid and the offset map, but the code below confuses me. Could you kindly explain it a bit? Thanks.
offsets_x = torch.cat([grid_x, grid_y + offsets_grid], 3)

Also, does MORN require a particular weight initialization? I think it might be best to initialize it to the identity transform.
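
For readers following along, here is a minimal sketch of how a grid built like the line quoted above can be passed to `torch.nn.functional.grid_sample`. It is not the repository's code: the function name `rectify_with_y_offsets`, the tensor shapes, and the choice of `align_corners=True` are assumptions made for illustration. The last dimension of the sampling grid holds (x, y) coordinates in [-1, 1], so concatenating `grid_x` with `grid_y + offsets_grid` leaves x untouched and displaces only y.

```python
import torch
import torch.nn.functional as F


def rectify_with_y_offsets(image, offsets_y):
    """Sample `image` (N, C, H, W) at the base grid shifted by `offsets_y` (N, 1, H, W).

    Offsets use grid_sample's normalized units; with align_corners=True,
    -1 and 1 are the centers of the corner pixels.
    """
    n, _, h, w = image.shape
    # Base grid: x runs left to right, y runs top to bottom, both in [-1, 1].
    grid_x = torch.linspace(-1.0, 1.0, w).view(1, 1, w, 1).expand(n, h, w, 1)
    grid_y = torch.linspace(-1.0, 1.0, h).view(1, h, 1, 1).expand(n, h, w, 1)
    # Move channels last so the offsets line up with the (N, H, W, 1) grids.
    offsets_grid = offsets_y.permute(0, 2, 3, 1)
    # Last dimension is (x, y): x is kept as-is, only y is displaced.
    grid = torch.cat([grid_x, grid_y + offsets_grid], dim=3)
    return F.grid_sample(image, grid, mode="bilinear",
                         padding_mode="zeros", align_corners=True)


# Assumed toy input: one single-channel 4x6 image and all-zero offsets.
image = torch.arange(24, dtype=torch.float32).view(1, 1, 4, 6)
zero_offsets = torch.zeros(1, 1, 4, 6)
identity_out = rectify_with_y_offsets(image, zero_offsets)
print(torch.allclose(identity_out, image, atol=1e-3))
```

In this sketch, all-zero offsets leave the base grid unchanged, so the sampled output matches the input up to floating-point error. That is one way to picture an identity starting point: an offset branch whose output starts near zero.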

@Cancerce1l
Author

Cancerce1l commented Mar 22, 2019

Actually, following up on my first question: I noticed that the activation function on the last convolution's output map isn't tanh, but instead
offsets_posi = nn.functional.relu(offsets, inplace=False)
offsets_nega = nn.functional.relu(-offsets, inplace=False)
offsets_pool = self.pool(offsets_posi) - self.pool(offsets_nega)
Is there any particular reason?
Thank you.

@Canjie-Luo
Owner

Sorry for the late reply.

  1. In MORAN v1, we found that the x-offset map didn't lead to a significant improvement. Thus, we disabled it and used only the y-offset map in v2. As for the question about the usage of the sampling function, please refer to the PyTorch documentation.

  2. We removed the Tanh() in v2 for more stable convergence. The pooling operation was also updated. The new operation extracts the maximum absolute values on the offset map.

Hopefully this will help you.
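
The effect of the ReLU-and-pool combination described in point 2 can be checked on a tiny tensor. This is a sketch, not the repository's module: `signed_pool` is a hypothetical helper, and the `MaxPool2d(kernel_size=2, stride=2)` settings are an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed pooling settings; the real module may use a different stride.
pool = nn.MaxPool2d(kernel_size=2, stride=2)


def signed_pool(offsets):
    """Pool the positive and negative parts separately, then recombine."""
    offsets_posi = F.relu(offsets)    # keeps positive values, zeros elsewhere
    offsets_nega = F.relu(-offsets)   # magnitudes of negative values, zeros elsewhere
    return pool(offsets_posi) - pool(offsets_nega)


# One 2x2 block of positive offsets and its all-negative mirror image.
all_positive = torch.tensor([[[[0.5, 2.0], [1.0, 1.5]]]])
all_negative = -all_positive

pos_result = signed_pool(all_positive).item()   # largest magnitude, sign kept
neg_result = signed_pool(all_negative).item()   # largest magnitude, sign kept
plain_max = pool(all_negative).item()           # plain max pooling, for contrast
print(pos_result, neg_result, plain_max)
```

For a block whose values all share one sign, the result is the value with the largest magnitude and its sign preserved. Plain max pooling on the all-negative block would instead return the value closest to zero.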

@Cancerce1l
Author

It helps a lot, thank you. I've trained a model based on both x and y offset maps, and it's doing poorly on the eval set. I think it's because the x offset affects the CTC decoding. Until I figure out how to fix this, I'll switch to y-offset only too.
Thanks again.

@Canjie-Luo
Owner

You're welcome!

@PkuDavidGuan

@Canjie-Luo Hi, I still can't understand this line of code:

offsets_pool = self.pool(offsets_posi) - self.pool(offsets_nega)

Why do you add the positive and negative offsets with the maximum absolute values?

I guess you assume that the offsets within a 2*2 block are in a similar range (all positive or all negative), but that assumption might not hold at the very beginning of training.
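
The arithmetic for a single block can be checked by hand. The plain-Python sketch below applies the same formula to one flattened 2*2 block; `signed_pool_block` and the sample values are hypothetical and chosen only to illustrate the mixed-sign case raised here.

```python
def signed_pool_block(block):
    """Apply pool(relu(x)) - pool(relu(-x)) to one flat 2x2 block of offsets."""
    largest_positive = max(max(v, 0.0) for v in block)
    largest_negative_magnitude = max(max(-v, 0.0) for v in block)
    return largest_positive - largest_negative_magnitude


same_sign = [0.25, 0.5, 0.125, 0.375]
mixed_sign = [0.25, -0.5, 0.125, -0.125]

same_result = signed_pool_block(same_sign)    # 0.5 - 0.0
mixed_result = signed_pool_block(mixed_sign)  # 0.25 - 0.5
print(same_result, mixed_result)  # prints "0.5 -0.25"
```

When a block mixes signs, the two pooled terms partly cancel: the result here is -0.25, which is neither the largest value (0.25) nor the value with the largest magnitude (-0.5). The "maximum absolute value" reading therefore holds exactly only when every offset in the block has the same sign.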
