Awesome Retrieval papers

CBIR in academia and industry

The goal is to collect classical and solid retrieval work from academia and industry.

Image Retrieval

Real-valued Feature

Forward Compatible Training for Large-Scale Embedding Retrieval Systems (CVPR2022) [paper]
Correlation Verification for Image Retrieval (CVPR2022) [paper]
Effective Conditioned and Composed Image Retrieval Combining CLIP-Based Features (CVPR2022) [paper]
Sketching Without Worrying: Noise-Tolerant Sketch-Based Image Retrieval (CVPR2022) [paper]
FashionVLP: Vision Language Transformer for Fashion Retrieval With Feedback (CVPR2022) [paper]
Beyond Cross-View Image Retrieval: Highly Accurate Vehicle Localization Using Satellite Image (CVPR2022) [paper]
Contextual Similarity Distillation for Asymmetric Image Retrieval (CVPR2022) [paper]
Feature Representation Learning for Unsupervised Cross-domain Image Retrieval (ECCV2022) [paper]
Hierarchical Average Precision Training for Pertinent Image Retrieval (ECCV2022) [paper][code]
PatchRD: Detail-Preserving Shape Completion by Learning Patch Retrieval and Deformation (ECCV2022) [paper][code]
Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval (ECCV2022) [paper]
LWGNet – Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval (ECCV2022) [paper]

Binary Feature

One Loss for Quantization: Deep Hashing With Discrete Wasserstein Distributional Matching (CVPR2022) [paper][code]
Deep Hash Distillation for Image Retrieval (ECCV2022) [paper][code]
SEMICON: A Learning-to-hash Solution for Large-scale Fine-grained Image Retrieval (ECCV2022) [paper]

Cross-Modal Retrieval

Real-valued Feature

Everything at Once - Multi-Modal Fusion Transformer for Video Retrieval (CVPR2022) [paper]
Bridging Video-Text Retrieval With Multiple Choice Questions (CVPR2022) [paper]
Object-Aware Video-Language Pre-Training for Retrieval (CVPR2022) [paper]
Cross Modal Retrieval With Querybank Normalisation (CVPR2022) [paper]
EI-CLIP: Entity-Aware Interventional Contrastive Learning for E-Commerce Cross-Modal Retrieval (CVPR2022) [paper]
Sketching Without Worrying: Noise-Tolerant Sketch-Based Image Retrieval (CVPR2022) [paper]
AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval (CVPR2022) [paper]
COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval (CVPR2022) [paper]
ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval (CVPR2022) [paper]
FashionVLP: Vision Language Transformer for Fashion Retrieval With Feedback (CVPR2022) [paper]
Sign Language Video Retrieval With Free-Form Textual Queries (CVPR2022) [paper]
UMT: Unified Multi-Modal Transformers for Joint Video Moment Retrieval and Highlight Detection (CVPR2022) [paper]
X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval (CVPR2022) [paper]
ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound (ECCV2022) [paper][project][code]
VTC: Improving Video-Text Retrieval with User Comments (ECCV2022) [paper][code]
Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval (ECCV2022) [paper]
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval (ECCV2022) [paper]
Adaptive Fine-Grained Sketch-Based Image Retrieval (ECCV2022) [paper]
Multi-Query Video Retrieval (ECCV2022) [paper]
Selective Query-guided Debiasing for Video Corpus Moment Retrieval (ECCV2022) [paper]
TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval (ECCV2022) [paper]
Learning Linguistic Association Towards Efficient Text-Video Retrieval (ECCV2022) [paper]
Granularity-aware Adaptation for Image Retrieval over Multiple Tasks (ECCV2022) [paper]
Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval (ECCV2022) [paper]
Audio-Visual Mismatch-Aware Video Retrieval via Association and Adjustment (ECCV2022) [paper]
Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval (ECCV2022) [paper]
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval (ECCV2022) [paper]
A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch (ECCV2022) [paper][project]

Binary Feature

Mutual Quantization for Cross-Modal Search With Noisy Labels (CVPR2022) [paper]

Real-World Scenarios

Person/Vehicle Re-Identification

Incremental/Continual Learning

Fine-Grained Retrieval

Copy/Duplicate Detection

Perceptual Image Hashing With Locality Preserving Projection for Copy Detection (TDSC2023) [paper]
Shrinking the Semantic Gap: Spatial Pooling of Local Moment Invariants for Copy-Move Forgery Detection (TIFS2023) [paper]
Efficient Hashing Method Using 2D-2D PCA for Image Copy Detection (TKDE2023) [paper]
Robust image hashing with Isomap and saliency map for copy detection (TMM2023) [paper]
A Self-Supervised Descriptor for Image Copy Detection (CVPR2022) [paper][code]
A Robust and Fast Video Copy Detection System Using Content-Based Fingerprinting (TIFS2011) [paper]

Long-Tail Visual Recognition

Retrieval Augmented Classification for Long-Tail Visual Recognition (CVPR2022) [paper]

Classical Local Feature

Object retrieval with large vocabularies and fast spatial matching, CVPR 2007.
Visual Categorization with Bags of Keypoints, ECCV 2004.
ORB: an efficient alternative to SIFT or SURF, ICCV 2011.
Object Recognition from Local Scale-Invariant Features, ICCV 1999.
Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval, ICCV 2007.
Three things everyone should know to improve object retrieval, CVPR 2012.
On-the-fly learning for visual search of large-scale image and video datasets
All about VLAD, CVPR 2013.
Aggregating local descriptors into a compact image representation, CVPR 2010.
More About VLAD: A Leap from Euclidean to Riemannian Manifolds, CVPR 2015.
Hamming embedding and weak geometric consistency for large scale image search, CVPR 2008.
Revisiting the VLAD image representation, project
Improving the Fisher Kernel for Large-Scale Image Classification, ECCV 2010.
Image Classification with the Fisher Vector: Theory and Practice
Democratic Diffusion Aggregation for Image Retrieval
A Vote-and-Verify Strategy for Fast Spatial Verification in Image Retrieval, ACCV 2016.
Triangulation embedding and democratic aggregation for image search, CVPR 2014.
Efficient Large-scale Image Search With a Vocabulary Tree, IPOL 2015, code.

Deep Learning Feature (Global Feature)

Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval, ECCV 2020.
SOLAR: Second-Order Loss and Attention for Image Retrieval, ECCV 2020.
Unifying Deep Local and Global Features for Image Search, arxiv 2020.
A Benchmark on Tricks for Large-scale Image Retrieval, arxiv 2020.
Learning with Average Precision: Training Image Retrieval with a Listwise Loss, ICCV 2019.
MultiGrain: a unified image embedding for classes and instances, arxiv 2019.
Deep Image Retrieval: Learning Global Representations for Image Search.
End-to-end Learning of Deep Visual Representations for Image Retrieval, a more detailed paper describing DIR.
What Is the Best Practice for CNNs Applied to Visual Instance Retrieval?, on the question of layer selection.
Bags of Local Convolutional Features for Scalable Instance Search.
Faster R-CNN Features for Instance Search, CVPR workshop 2016.
Cross-dimensional Weighting for Aggregated Deep Convolutional Features, project.
Class-Weighted Convolutional Features for Image Retrieval.
Multi-Scale Orderless Pooling of Deep Convolutional Activation Features, VLAD coding.
Aggregating Deep Convolutional Features for Image Retrieval, paper notes, Research Progress on Deep Learning Based Visual Instance Search.
Particular object retrieval with integral max-pooling of CNN activations, project.
Particular object retrieval using CNN.
Learning to Match Aerial Images with Deep Attentive Architectures.
Siamese Network of Deep Fisher-Vector Descriptors for Image Retrieval.
Combining Fisher Vector and Convolutional Neural Networks for Image Retrieval, improvement from fusing FV and CNN features.
Selective Deep Convolutional Features for Image Retrieval, ACM MM 2017.
Fine-tuning CNN Image Retrieval with No Human Annotation, TPAMI 2018.
An accurate retrieval through R-MAC+ descriptors for landmark recognition.
Regional Attention Based Deep Feature for Image Retrieval, code, BMVC 2018.
Detect-to-Retrieve: Efficient Regional Aggregation for Image Search, CVPR 2019.
Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking, project, CVPR 2018.
Guided Similarity Separation for Image Retrieval, NeurIPS 2019.

Deep Learning Feature (Local Feature)

Learning Super-Features for Image Retrieval, ICLR 2022, code.
LoFTR: Detector-Free Local Feature Matching with Transformers, CVPR 2021, code.
DFM: A Performance Baseline for Deep Feature Matching, CVPRW 2021, code.
COTR: Correspondence Transformer for Matching Across Images, arxiv 2021.
Online Invariance Selection for Local Feature Descriptors, ECCV 2020, code.
Learning and aggregating deep local descriptors for instance-level recognition, ECCV 2020, code.
DISK: Learning local features with policy gradient, NeurIPS 2020, code.
D2D: Keypoint Extraction with Describe to Detect Approach, arxiv 2020.
UR2KiD: Unifying Retrieval, Keypoint Detection, and Keypoint Description without Local Correspondence Supervision, arxiv.
Visualizing Deep Similarity Networks, WACV 2019.
Combination of Multiple Global Descriptors for Image Retrieval.
Beyond Cartesian Representations for Local Descriptors, code, ICCV 2019.
R2D2: Reliable and Repeatable Detector and Descriptor, code, NeurIPS 2019.
SOSNet: Second Order Similarity Regularization for Local Descriptor Learning, CVPR 2019.
Local Features and Visual Words Emerge in Activations, CVPR 2019.
Explicit Spatial Encoding for Deep Local Descriptors, CVPR 2019.
Key.Net: Keypoint Detection by Handcrafted and Learned CNN Filters, ICCV 2019.
Learning Discriminative Affine Regions via Discriminability, affnet.
A Large Dataset for Improving Patch Matching, PS-Dataset.
Working hard to know your neighbor's margins: Local descriptor learning loss, code.
MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching, code.
LF-Net: Learning Local Features from Images, NeurIPS 2018.
Local Descriptors Optimized for Average Precision, CVPR 2018.
SuperPoint: Self-Supervised Interest Point Detection and Description, Magic Leap.
GeoDesc: Learning Local Descriptors by Integrating Geometry Constraints, code, ECCV 2018.
Learning local feature descriptors with triplets and shallow convolutional neural networks, BMVC 2016.

Deep Learning Feature (Instance Search)

Deeply Activated Salient Region for Instance Search, arXiv 2020.
Instance search based on weakly supervised feature learning, Neurocomputing 2019.
Instance Search via Instance Level Segmentation and Feature Representation, arXiv 2018.
Unsupervised object discovery for instance recognition, WACV 2018.
Faster R-CNN Features for Instance Search, CVPR workshop 2016.

ANN search

Results of the NeurIPS’21 Challenge on Billion-Scale Approximate Nearest Neighbor Search.
Nearest neighbor search with compact codes: A decoder perspective, arxiv 2021.
Accelerating Large-Scale Inference with Anisotropic Vector Quantization, blog, code, ICML 2020.
Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination, SIGMOD 2020.
RobustiQ: A Robust ANN Search Method for Billion-scale Similarity Search on GPUs, ICMR 2019.
Zoom: Multi-View Vector Search for Optimizing Accuracy, Latency and Memory.
Vector and Line Quantization for Billion-scale Similarity Search on GPUs.
GGNN: Graph-based GPU Nearest Neighbor Search, arxiv 2019, code.
Learning to Route in Similarity Graphs, ICML 2019.
Practical and Optimal LSH for Angular Distance.
pq-fast-scan.
faiss. A library for efficient similarity search and clustering of dense vectors.
Polysemous codes.
Optimized Product Quantization.
lopq. Training of Locally Optimized Product Quantization (LOPQ) models for approximate nearest neighbor search of high dimensional data in Python and Spark.
nns_benchmark. Benchmark of Nearest Neighbor Search on High Dimensional Data.
Falconn. FAst Lookups of Cosine and Other Nearest Neighbors.
Annoy. Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk.
NMSLIB. Non-Metric Space Library (NMSLIB): A similarity search library and a toolkit for evaluation of k-NN methods for generic non-metric spaces.
Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs, graph-based method.
Fast Approximate Nearest Neighbor Search With Navigating Spreading-out Graphs, code
Efficient Nearest Neighbors Search for Large-Scale Landmark Recognition
NV-tree: A Scalable Disk-Based High-Dimensional Index.
Dynamicity and Durability in Scalable Visual Instance Search.
Revisiting the Inverted Indices for Billion-Scale Approximate Nearest Neighbors, code.
Link and code: Fast indexing with graphs and compact regression codes.
A Survey of Product Quantization, a fairly complete survey of vector quantization methods, worth reading.
GeoDesc: Learning Local Descriptors by Integrating Geometry Constraints, learns local feature descriptors with strong matching ability.
Learning a Complete Image Indexing Pipeline, CVPR 2018.
Spreading vectors for similarity search, ICLR 2019.
SPTAG: A library for fast approximate nearest neighbor search. Microsoft.

CBIR Attack

Open Set Adversarial Examples.

CBIR rank

Fast Spectral Ranking for Similarity Search, code, CVPR 2018.

CBIR in Industry

Videntifier is a visual search engine based on a patented large-scale local feature database, demo, based on SIFT feature and NV-tree. (Chinese blog post).
Web-Scale Responsive Visual Search at Bing.
Visual Search at Alibaba (KDD2018) [paper].
Visual Search at Pinterest.
Visual Discovery at Pinterest.
Learning a Unified Embedding for Visual Search at Pinterest, KDD 2019.
Visual Search at eBay.
Deep Learning based Large Scale Visual Recommendation and Search for E-Commerce, project.
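Revealing the Technology Behind WeChat's "Scan" Object Recognition (in Chinese).
Revealed: Why Is WeChat's "Scan" Object Recognition So Fast? (in Chinese)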

CBIR Competition and Challenge

Feature Fusion

Feature fusion using Canonical Correlation Analysis.

Instance Matching

Semantic Matching

End-to-end weakly-supervised semantic alignment.

Template Matching

QATM: Quality-Aware Template Matching For Deep Learning, CVPR 2019.

Image Identification

Image Identification Using SIFT Algorithm: Performance Analysis against Different Image Deformations.

Tutorials

PyRetri, Open source deep learning based image retrieval toolbox based on PyTorch.
How to Apply Distance Metric Learning to Street-to-Shop Problem.
Recent Image Search Techniques.
Compact Features for Visual Search.
multimedia-indexing. A framework for large-scale feature extraction, indexing and retrieval.
Image Similarity using Deep Ranking, code.
Triplet Loss and Online Triplet Mining in TensorFlow.
tf_retrieval_baseline.

Slide

VRG Prague in “Large-Scale Landmark Recognition Challenge”, ranked 3rd in the Google Landmark Recognition Challenge.

Demo and Demo Online

Visual Image Retrieval and Localization, SIFT feature encoded by BOW.
VGG Image Search Engine, SIFT feature encoded by BOW.
SoTu, a Flask-based CBIR system.
yisou, a Flask-based painting CBIR system; the search algorithm is designed by Yong Yuan.

Datasets

DeepFashion2 Dataset, DeepFashion2 is a comprehensive fashion dataset.
Holidays, Holidays consists of images from personal holiday albums of various scene types.
Oxford, Oxford consists of images of 11 different Oxford landmarks.
Paris, Paris consists of images crawled from 11 queries on specific Paris architecture.
ROxford and RParis, ROxford and RParis are revisited versions of the original Oxford and Paris with annotation corrections, enlarged sizes and more difficult samples.
INSTRE, INSTRE is an instance-level object retrieval dataset.

Useful Package

VLFeat
Yael
