Awesome Retrieval papers

The main goal is to collect classic, solid work on retrieval from academia and industry.


Image Retrieval

Real-valued Feature

  • Forward Compatible Training for Large-Scale Embedding Retrieval Systems (CVPR2022) [paper]
  • Correlation Verification for Image Retrieval (CVPR2022) [paper]
  • Effective Conditioned and Composed Image Retrieval Combining CLIP-Based Features (CVPR2022) [paper]
  • Sketching Without Worrying: Noise-Tolerant Sketch-Based Image Retrieval (CVPR2022) [paper]
  • FashionVLP: Vision Language Transformer for Fashion Retrieval With Feedback (CVPR2022) [paper]
  • Beyond Cross-View Image Retrieval: Highly Accurate Vehicle Localization Using Satellite Image (CVPR2022) [paper]
  • Contextual Similarity Distillation for Asymmetric Image Retrieval (CVPR2022) [paper]
  • (CVPR2022) [paper]
  • Feature Representation Learning for Unsupervised Cross-domain Image Retrieval (ECCV2022) [paper]
  • Hierarchical Average Precision Training for Pertinent Image Retrieval (ECCV2022) [paper][code]
  • PatchRD: Detail-Preserving Shape Completion by Learning Patch Retrieval and Deformation (ECCV2022) [paper][code]
  • Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval (ECCV2022) [paper]
  • LWGNet – Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval (ECCV2022) [paper]

Binary Feature

  • One Loss for Quantization: Deep Hashing With Discrete Wasserstein Distributional Matching (CVPR2022) [paper][code]
  • Deep Hash Distillation for Image Retrieval (ECCV2022) [paper][code]
  • SEMICON: A Learning-to-hash Solution for Large-scale Fine-grained Image Retrieval (ECCV2022) [paper]

Cross-Modal Retrieval

Real-valued Feature

  • Everything at Once - Multi-Modal Fusion Transformer for Video Retrieval (CVPR2022) [paper]
  • Bridging Video-Text Retrieval With Multiple Choice Questions (CVPR2022) [paper]
  • Object-Aware Video-Language Pre-Training for Retrieval (CVPR2022) [paper]
  • Cross Modal Retrieval With Querybank Normalisation (CVPR2022) [paper]
  • EI-CLIP: Entity-Aware Interventional Contrastive Learning for E-Commerce Cross-Modal Retrieval (CVPR2022) [paper]
  • Sketching Without Worrying: Noise-Tolerant Sketch-Based Image Retrieval (CVPR2022) [paper]
  • AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval (CVPR2022) [paper]
  • COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval (CVPR2022) [paper]
  • ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval (CVPR2022) [paper]
  • FashionVLP: Vision Language Transformer for Fashion Retrieval With Feedback (CVPR2022) [paper]
  • Sign Language Video Retrieval With Free-Form Textual Queries (CVPR2022) [paper]
  • UMT: Unified Multi-Modal Transformers for Joint Video Moment Retrieval and Highlight Detection (CVPR2022) [paper]
  • X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval (CVPR2022) [paper]
  • ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound (ECCV2022) [paper][project][code]
  • VTC: Improving Video-Text Retrieval with User Comments (ECCV2022) [paper][code]
  • Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval (ECCV2022) [paper]
  • MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval (ECCV2022) [paper]
  • Adaptive Fine-Grained Sketch-Based Image Retrieval (ECCV2022) [paper]
  • Multi-Query Video Retrieval (ECCV2022) [paper]
  • Selective Query-guided Debiasing for Video Corpus Moment Retrieval (ECCV2022) [paper]
  • TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval (ECCV2022) [paper]
  • Learning Linguistic Association Towards Efficient Text-Video Retrieval (ECCV2022) [paper]
  • Granularity-aware Adaptation for Image Retrieval over Multiple Tasks (ECCV2022) [paper]
  • Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval (ECCV2022) [paper]
  • Audio-Visual Mismatch-Aware Video Retrieval via Association and Adjustment (ECCV2022) [paper]
  • Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval (ECCV2022) [paper]
  • CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval (ECCV2022) [paper]
  • A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch (ECCV2022) [paper][project]

Binary Feature

  • Mutual Quantization for Cross-Modal Search With Noisy Labels (CVPR2022) [paper]

Real-World Scenarios

Person/Vehicle Re-Identification

Incremental/Continual Learning

Fine-Grained Retrieval

Copy/Duplicate Detection

  • Perceptual Image Hashing With Locality Preserving Projection for Copy Detection (TDSC2023) [paper]
  • Shrinking the Semantic Gap: Spatial Pooling of Local Moment Invariants for Copy-Move Forgery Detection (TIFS2023) [paper]
  • Efficient Hashing Method Using 2D-2D PCA for Image Copy Detection (TKDE2023) [paper]
  • Robust image hashing with Isomap and saliency map for copy detection (TMM2023) [paper]
  • A Self-Supervised Descriptor for Image Copy Detection (CVPR2022) [paper][code]
  • A Robust and Fast Video Copy Detection System Using Content-Based Fingerprinting (TIFS2011) [paper]

Long-Tail Visual Recognition

  • Retrieval Augmented Classification for Long-Tail Visual Recognition (CVPR2022) [paper]

Classical Local Feature

Deep Learning Feature (Global Feature)

  • Online Invariance Selection for Local Feature Descriptors (ECCV2020) [code]

Deep Learning Feature (Local Feature)

Deep Learning Feature (Instance Search)

ANN search

CBIR Attack

CBIR rank

CBIR in Industry

CBIR Competition and Challenge

Feature Fusion

Instance Matching

Semantic Matching

Template Matching

Image Identification



Demo and Demo Online

Datasets


  • DeepFashion2 Dataset, DeepFashion2 is a comprehensive fashion dataset.
  • Holidays, Holidays consists of images from personal holiday albums covering various scene types.
  • Oxford, Oxford consists of 11 different Oxford landmarks.
  • Paris, Paris consists of images crawled from 11 queries on specific Paris architecture.
  • ROxford and RParis, ROxford and RParis are revisited versions of the original Oxford and Paris with annotation corrections, enlarged sizes and more difficult samples.
  • INSTRE, INSTRE is an instance-level object retrieval dataset.

Useful Package
