Wrong PPO Model architecture. #26

alirezakazemipour · 2020-10-06T16:09:33Z

According to the DQN Nature paper and the PPO1 implementation, this line:

X = activ(conv(X, 'c3', nf=64, rf=4, stride=1, init_scale=np.sqrt(2), data_format=data_format))

should be changed to:

X = activ(conv(X, 'c3', nf=64, rf=3, stride=1, init_scale=np.sqrt(2), data_format=data_format))

In short, the kernel size of the third convolutional layer is wrong: rf should be 3, not 4.
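To make the effect concrete, here is a rough sketch of how the kernel size changes the feature map that the third layer produces. It assumes an 84x84 input, unpadded ('VALID') convolutions, and the first two layers from the DQN Nature paper (8x8 kernel with stride 4, then 4x4 kernel with stride 2). None of those values appear in the line quoted above, and the helper names conv_out and flattened_features are made up for illustration.

```python
def conv_out(size, kernel, stride):
    """Output length along one axis of an unpadded ('VALID') convolution."""
    return (size - kernel) // stride + 1


def flattened_features(c3_kernel, input_size=84, c3_filters=64):
    """Return (side length, flattened feature count) after the third conv layer.

    The c1 and c2 settings are assumptions taken from the DQN Nature paper,
    not from the code quoted in this issue.
    """
    size = conv_out(input_size, kernel=8, stride=4)    # c1: 84 -> 20
    size = conv_out(size, kernel=4, stride=2)          # c2: 20 -> 9
    size = conv_out(size, kernel=c3_kernel, stride=1)  # c3: depends on rf
    return size, size * size * c3_filters


for rf in (4, 3):
    side, features = flattened_features(rf)
    print(f"rf={rf}: {side}x{side}x64 feature map, {features} flattened features")
```

Under these assumptions, rf=3 gives the 7x7x64 map (3136 features) that the Nature architecture feeds into its fully connected layer, while rf=4 shrinks it to 6x6x64 (2304 features).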

xiaioding · 2023-04-10T04:42:14Z

What is the difference between these two lines?

alirezakazemipour · 2023-04-11T14:11:05Z

@xiaioding
The difference is in the kernel size (rf): 4 in the current line, 3 in the corrected one.
