new paper suggests that feedforwards need to be present
release
Update setup.py
bump
fix post attn layer norm when using embedding factorization
add a norm to post-attention layers
fix autopadder for reformer class as well
fix bug with memory key/values
do not assert that dimension is divisible by heads if dim_head is supplied
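The last change above can be sketched as follows. This is a minimal illustration, not the repo's actual code; the helper name `resolve_dim_head` and its parameters are assumptions based on the commit message.

```python
# Hypothetical sketch: only require dim to split evenly across heads
# when dim_head is not given explicitly.
def resolve_dim_head(dim, heads, dim_head=None):
    # If dim_head is supplied, use it directly; dim need not be
    # divisible by the number of heads.
    if dim_head is not None:
        return dim_head
    # Otherwise derive the per-head dimension from dim, which must
    # divide evenly across heads.
    assert dim % heads == 0, 'dimension must be divisible by number of heads'
    return dim // heads
```

For example, `resolve_dim_head(500, 8, dim_head=64)` succeeds even though 500 is not divisible by 8, while `resolve_dim_head(500, 8)` raises an assertion error.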