One figure each to explain recent vision transformers

ZhuangLii/transformer_pipeline

transformer_pipeline

Support

Performance Comparisons

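Performance comparisons on ImageNet-1K

| Method | Input resolution | Top-1 accuracy (%) |
| --- | --- | --- |
| ViT-B | 384 | 77.9 |
| SwinV1-B | 384 | 84.2 |
| CSWin-B | 384 | 85.4 |
| iRPE (based on DeiT-B) | 224 | 82.4 |
| DAT-B | 384 | 84.8 |
| CvT-21 | 384 | 84.9 |
| CrossViT-18 | 384 | 83.9 |
| SwinV2-B | 384 | 87.1 |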

Figures

[Figure: Attention Is All You Need]

[Figure: Vision Transformer]

[Figure: Swin Transformer]

[Figure: CSWin Transformer]

[Figure: DETR]

[Figure: iRPE: Rethinking Position Encoding]

[Figure: Deformable Attention Transformer]

[Figure: CvT: Introducing Convolutions to Vision Transformers]

[Figure: CrossViT]

[Figure: SwinTrack]

[Figure: Stark]
