
SLT-Net

This repository contains the code for our CVPR 2022 paper "Implicit Motion Handling for Video Camouflaged Object Detection" [CVPR 2022] [arXiv] [Project Page].

SLT-Net: We propose a new video camouflaged object detection (VCOD) framework that exploits both short-term dynamics and long-term temporal consistency to detect camouflaged objects from video frames.

1. Features

Summary. This repository contains the source code, prediction results, and an evaluation toolbox in the eval folder.

Demo videos. The Videos folder contains video results of our SLT-Net and of two top-performing baselines (SINet and RCRNet) on the MoCA-Mask test set.

Results. The results of all compared methods and the whole MoCA-Mask dataset can be found here.

2. Proposed Framework


Figure 1: The overall pipeline of SLT-Net. SLT-Net consists of a short-term detection module and a long-term refinement module. The short-term detection module takes a pair of consecutive frames and predicts the camouflaged object mask for the reference frame. The long-term refinement module takes T predictions from the short-term detection module, along with their corresponding reference frames, to generate the final predictions.
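The composition of the two modules can be sketched as follows. This is a conceptual sketch only: the class, its argument names, the tensor shapes, and the window length T are illustrative placeholders and do not mirror the actual interfaces in this repository.

# Conceptual sketch of the two-stage pipeline described above (not the actual implementation).
# `short_term` and `long_term` stand in for the real detection / refinement modules.
import torch
import torch.nn as nn

class SLTNetSketch(nn.Module):
    def __init__(self, short_term: nn.Module, long_term: nn.Module, T: int = 5):
        super().__init__()
        self.short_term = short_term  # predicts a mask from a pair of consecutive frames
        self.long_term = long_term    # jointly refines T short-term predictions
        self.T = T

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (B, T+1, C, H, W), i.e. T reference frames plus one extra frame for the last pair
        short_preds = []
        for t in range(self.T):
            ref, nxt = frames[:, t], frames[:, t + 1]        # consecutive frame pair
            short_preds.append(self.short_term(ref, nxt))    # coarse mask for the reference frame
        short_preds = torch.stack(short_preds, dim=1)        # (B, T, 1, H, W)
        # refine the T coarse masks using the corresponding reference frames
        return self.long_term(frames[:, :self.T], short_preds)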

The training and testing experiments were conducted in PyTorch on a single NVIDIA V100 GPU with 32 GB of memory.

Note that our model can also be trained on GPUs with less memory; simply lower the batch size.

3. Preparation

Requirements.

  1. Python 3.9.*
  2. CUDA 11.1
  3. PyTorch
  4. TorchVision

Install. Create a virtual environment and activate it.

conda create -n SLTnet python=3.8
conda activate SLTnet

The code has been tested with PyTorch 1.9 and CUDA 11.1.

conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia
conda install -c conda-forge timm

Install MMCV + MMSegmentation

Follow the instructions here. MMCV and MMSegmentation are required for training the transformer encoder. A quick installation example:

pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
pip install mmsegmentation

For the seq-to-seq model of the long-term architecture, the core is built as a CUDA op with torchlib. More details can be found on GitHub. A quick installation example:

cd ./lib/ref_video/PNS
python setup.py build develop

Dataset. To train/evaluate the SLT-Net network, you will need to download the required datasets. Note that if you want to use our pseudo labels, please download them via [MoCA-Mask-Pseudo].

Change the first-column paths in create_link.sh to your actual dataset locations, then run create_link.sh to create symbolic links in the datasets folder pointing to wherever the datasets were downloaded.

├── datasets
    ├── MoCA-Mask
    ├── CAD2016
    ├── COD10K
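If editing the shell script is inconvenient, a rough Python equivalent is sketched below. The source paths are placeholders for wherever you downloaded the data; the snippet only mirrors what create_link.sh is described to do.

# Minimal Python equivalent of create_link.sh: symlink downloaded datasets into ./datasets.
# The source paths are placeholders; point them at your actual download locations.
import os

links = {
    "/path/to/downloads/MoCA-Mask": "datasets/MoCA-Mask",
    "/path/to/downloads/CAD2016":   "datasets/CAD2016",
    "/path/to/downloads/COD10K":    "datasets/COD10K",
}

os.makedirs("datasets", exist_ok=True)
for src, dst in links.items():
    if not os.path.islink(dst) and not os.path.exists(dst):
        os.symlink(src, dst)  # create the symbolic link expected under datasets/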

Note that for the CAD2016 dataset, the original ground-truth maps label each pixel with an index of 1/2. You need to convert them to 0/255. We also provide the converted ground truth here for your convenience.
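If you perform the conversion yourself, a minimal sketch is given below. It assumes the index value 2 marks the camouflaged object (please verify a few converted masks against the provided ground truth), and the folder names are placeholders.

# Hedged sketch: convert CAD2016 index masks (pixel values 1/2) to binary 0/255 masks.
# Assumption: index 2 marks the camouflaged object; inspect a few outputs to confirm.
import glob
import os
import numpy as np
from PIL import Image

src_dir = "datasets/CAD2016/gt_index"   # placeholder: folder with the original index maps
dst_dir = "datasets/CAD2016/gt_binary"  # placeholder: output folder for 0/255 masks
os.makedirs(dst_dir, exist_ok=True)

for path in glob.glob(os.path.join(src_dir, "*.png")):
    index_map = np.array(Image.open(path))
    binary = np.where(index_map == 2, 255, 0).astype(np.uint8)  # foreground -> 255, rest -> 0
    Image.fromarray(binary).save(os.path.join(dst_dir, os.path.basename(path)))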

4. Results

Prediction. You can generate predictions from a trained model for each dataset using the prediction scripts below, which produce *.png prediction maps for the corresponding datasets.

sh test_video.sh
sh test_video_long_term.sh

Evaluation. Please run main_CAD.m or main_MoCA.m in the eval folder to evaluate your model. You can also simply download the prediction images via this Link to reproduce the results reported in our paper, or download our pre-trained model via this link: snapshot. [If you downloaded it before 7 Sep 2022, please replace it with the new version; the Net_epoch_cod10k.pth in the previous snapshot contained incorrect ResNet pretrained weights.]

Acknowledgements. More information about the original MoCA dataset [1] can be found via this Link.

[1] Hala Lamdouar, Charig Yang, Weidi Xie, and Andrew Zisserman. Betrayed by Motion: Camouflaged Object Discovery via Motion Segmentation. In Asian Conference on Computer Vision (ACCV), 2020.

5. Citing

If you find this code useful, please consider citing our work:

@inproceedings{cheng2022implicit,
  title={Implicit Motion Handling for Video Camouflaged Object Detection},
  author={Cheng, Xuelian and Xiong, Huan and Fan, Deng-Ping and Zhong, Yiran and Harandi, Mehrtash and Drummond, Tom and Ge, Zongyuan},
  booktitle={CVPR},
  year={2022}
}
