Stars
Official repository of Slide-Transformer (CVPR2023)
PyTorch implementation for Image Captioning.
A Library for Advanced Deep Time Series Models.
The official repository of the paper "Learning Correlation Structures for Vision Transformers" accepted to CVPR 2024.
[NAACL 2024] LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-text Generation?
A PyTorch reimplementation of bottom-up-attention models
Implementation of 'X-Linear Attention Networks for Image Captioning' [CVPR 2020]
Grid features pre-training code for visual question answering
Official Code for 'RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words' (CVPR 2021)
Official PyTorch implementation of the paper "Dual-Level Collaborative Transformer for Image Captioning" (AAAI 2021).
Implementation of 'End-to-End Transformer Based Model for Image Captioning' [AAAI 2022]
Meshed-Memory Transformer for Image Captioning. CVPR 2020
Torch implementation of ResNet from http://arxiv.org/abs/1512.03385 and training scripts
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
Vision-Language Pre-training for Image Captioning and Question Answering
Simple image captioning model
Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, and other methods.
A curated list of image captioning and related area resources. :-)
Efficient computing methods developed by Huawei Noah's Ark Lab
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
Code for paper "Attention on Attention for Image Captioning". ICCV 2019
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
🔥LeetCode solutions in any programming language | Solutions in multiple programming languages to LeetCode, 剑指 Offer (Coding Interviews, 2nd edition), and Cracking the Coding Interview (6th edition)