Zero-shot sketch-based remote sensing image retrieval based on multi-level and attention-guided tokenization
This repository accompanies the paper “Zero-Shot Sketch-Based Remote-Sensing Image Retrieval Based on Multi-Level and Attention-Guided Tokenization”. It contains the official PyTorch implementation of the multi-level and attention-guided tokenization network.
* Python 3.7
* PyTorch 1.11.0
* torchvision 0.12.0
* einops 0.6.1
We provide access to download the RSketch_Ext dataset from Baidu Web Disk (access password: xpmv). You are free to divide it into a training set and a test set as you wish.
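Since the split is left to the user, a zero-shot setup usually keeps the training and test classes disjoint. The sketch below shows one way to do that. It assumes the dataset root contains one sub-directory per class, which is an assumption about the layout rather than something stated here. The helper names `list_classes` and `split_classes` and the `unseen_ratio` parameter are hypothetical and not part of this repository.

```python
import random
from pathlib import Path


def list_classes(dataset_root):
    """Return class names, assuming one sub-directory per class (assumed layout)."""
    return [p.name for p in Path(dataset_root).iterdir() if p.is_dir()]


def split_classes(class_names, unseen_ratio=0.25, seed=0):
    """Split class names into disjoint seen (train) and unseen (test) lists.

    unseen_ratio is a hypothetical knob: the fraction of classes held out
    for testing. The same seed always yields the same split.
    """
    names = sorted(class_names)          # sort first so the result is reproducible
    rng = random.Random(seed)            # local RNG, leaves global state untouched
    rng.shuffle(names)
    n_unseen = max(1, round(len(names) * unseen_ratio))
    unseen = sorted(names[:n_unseen])
    seen = sorted(names[n_unseen:])
    return seen, unseen
```

A call such as `seen, unseen = split_classes(list_classes("path/to/RSketch_Ext"))` would then give the class lists from which the directories for `train_path` and `test_path` could be built.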
The ViT model pre-trained on ImageNet-1K is provided on Baidu Web Disk (access password: t6p1). Place sam_ViT-B_16.pth in ./model and, if necessary, change the path on line 195 of ./model/self_attention.py to an absolute path.
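If a relative path fails to resolve from your working directory, one option is to compute the absolute path instead of hard-coding it. This is a minimal sketch using only the standard library. The helper names `checkpoint_path` and `require_checkpoint` are hypothetical, and how the path is consumed inside ./model/self_attention.py is not shown here.

```python
from pathlib import Path


def checkpoint_path(project_root, filename="sam_ViT-B_16.pth"):
    """Return the absolute path of <project_root>/model/<filename>."""
    return (Path(project_root) / "model" / filename).resolve()


def require_checkpoint(project_root, filename="sam_ViT-B_16.pth"):
    """Like checkpoint_path, but fail early if the weights file is missing."""
    path = checkpoint_path(project_root, filename)
    if not path.is_file():
        raise FileNotFoundError(f"Expected pre-trained weights at {path}")
    return path
```

For example, `print(require_checkpoint("."))` run from the repository root prints the absolute path that could be pasted into the model code, or raises an error if the file has not been placed in ./model yet.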
    # dataset
    train_path      # path to load training data.
    test_path       # path to load test data.

    # model
    d_model         # feature dimension.
    d_ff            # feed-forward layer dimension.
    head            # number of cross-attention encoder heads.
    number          # number of cross-attention encoder layers.
    pretrained      # whether to use the pre-trained ViT model.

    # train
    save            # model save path.
    batch           # batch size.
    epoch           # number of training epochs.
    datasetLen      # the amount of data training in a single batch.
    learning_rate   # learning rate.
    weight_decay    # weight decay.

    # test
    load            # model load path.
    test_sk         # number of test-set sketches fed in a single batch.
    test_im         # number of test-set remote sensing images fed in a single batch.
    num_workers     # number of dataloader workers.
    database_path   # load path of the pre-inferred remote sensing image database.
    amount          # number of returned remote sensing images to visualize.
    result_path     # save path for the accuracy evaluation results.

    # other
    choose_cuda     # CUDA device to use.
    seed            # random seed.

Thank you and sorry for the bugs!

* Bo Yang
* Chen Wang
* Xiaoshuang Ma
* Beiping Song
* Zhuang Liu
* Fangde Sun

If you think this work is interesting, please cite:
    @Article{Cross-modal-retrieval-MLAGT,
      title={Zero-shot sketch-based remote sensing image retrieval based on multi-level and attention-guided tokenization},
      author={Bo Yang and Chen Wang and Xiaoshuang Ma and Beiping Song and Zhuang Liu and Fangde Sun},
      year={2024},
      journal={Remote Sensing},
      volume={16},
      number={10},
      pages={1653},
      doi={10.3390/rs16101653}
    }