Tensor sizes input in ConvNet and RNN #12

AlexTS1980 · 2019-07-04T15:43:07Z

Thanks for the code. I have a couple of questions regarding tensor sizes.

The dataloader creates tensors of size X = (#videos, #frames, 3, H, W) and y = (#videos, 1). There's a loop over #videos in the train method, but in my implementation it only ever returned index 0, so the input to the ConvNet has size (#videos, #frames, 3, H, W). Is this correct?
In the ConvNet's forward method there's a loop over the #frames of the video; it flattens the output of the pooling layer into a vector, giving a tensor of size (#videos, #frames, CNN_embed_dim), which is both the output of the ConvNet and the input to the RNN. Is this right?
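
To make the shapes in these two questions concrete, here is a minimal sketch of the per-frame loop as I understand it. `FrameEncoder`, its layers and all sizes are hypothetical stand-ins chosen for illustration, not the repo's actual ConvNet.

```python
import torch
import torch.nn as nn


class FrameEncoder(nn.Module):
    """Hypothetical stand-in for the ConvNet encoder (not the repo's model)."""

    def __init__(self, embed_dim=16):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, embed_dim)

    def forward(self, x):
        # x: (#videos, #frames, 3, H, W)
        embeddings = []
        for t in range(x.size(1)):
            # x[:, t] selects frame t of every video: (#videos, 3, H, W)
            feat = self.pool(torch.relu(self.conv(x[:, t])))  # (#videos, 8, 1, 1)
            feat = feat.flatten(start_dim=1)                  # (#videos, 8)
            embeddings.append(self.fc(feat))                  # (#videos, embed_dim)
        # Stack along a new time axis: (#videos, #frames, embed_dim)
        return torch.stack(embeddings, dim=1)


videos = torch.randn(4, 10, 3, 32, 32)  # 4 videos, 10 frames each (assumed sizes)
cnn_out = FrameEncoder(embed_dim=16)(videos)
print(cnn_out.shape)  # torch.Size([4, 10, 16])
```

In this sketch the loop runs over frames only; the #videos dimension is carried through every layer as the ordinary batch dimension.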

I don't quite understand how the RNN processes the batch dimension, i.e. the number of videos. Is there some internal loop over it that I can't find in the code?
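
For context on the batch question, here is a small sketch assuming the RNN is a standard `torch.nn.LSTM` built with `batch_first=True` (an assumption, I have not confirmed this is what the repo uses). Such a module takes the whole (#videos, #frames, embed_dim) tensor in one call and treats each video as an independent sequence, so no explicit Python loop over videos is needed. The sizes below are made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Assumed sizes: embed_dim=16, hidden_size=32, 4 videos of 10 frames each.
rnn = nn.LSTM(input_size=16, hidden_size=32, num_layers=1, batch_first=True)
cnn_out = torch.randn(4, 10, 16)  # (#videos, #frames, embed_dim)

with torch.no_grad():
    # One call handles the whole batch.
    batched_out, (h_n, c_n) = rnn(cnn_out)  # batched_out: (#videos, #frames, hidden)

    # Same computation with an explicit loop over videos, for comparison only.
    looped_out = torch.cat(
        [rnn(cnn_out[i:i + 1])[0] for i in range(cnn_out.size(0))], dim=0
    )

print(batched_out.shape)  # torch.Size([4, 10, 32])
print(h_n.shape)          # torch.Size([1, 4, 32])

# The two results should agree up to floating-point tolerance.
same = torch.allclose(batched_out, looped_out, atol=1e-5)
```

If the repo's RNN follows this pattern, the "internal loop" over videos lives inside the library's batched implementation rather than in the model code.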
