wzheng1983/tf-faster-rcnn

tf-faster-rcnn

A Tensorflow implementation of the faster RCNN detection framework by Xinlei Chen (xinleic@cs.cmu.edu). This repository is based on the Python Caffe implementation of faster RCNN available here. For details about the faster RCNN architecture, please refer to the paper Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks by Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun.

Note: Several modifications were made when reimplementing the framework, which give potential improvements. For details about the modifications and ablative analysis, please refer to the technical report An Implementation of Faster RCNN with Study for Region Sampling. If you are seeking to reproduce the results in the original paper, please use the official code or the semi-official code.

Detection Performance

We have only tested it on the VGG16 architecture so far. Our best performance as of January 2017:

  • Train on VOC 2007 trainval and test on VOC 2007 test, 71.2.
  • Train on COCO 2014 trainval-minival and test on minival (longer), 28.3.

Note that the above numbers are obtained with a different testing scheme; the original testing scheme results in slightly worse performance (see report). Since we keep the small proposals (< 16 pixels), our performance is especially good for small objects.
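
To illustrate the remark about small proposals, here is a minimal sketch of a size filter of the kind that discards proposals below a minimum side length. The function name `filter_small_boxes`, the `(x1, y1, x2, y2)` box layout, and the inclusive `+ 1` pixel convention are assumptions made for illustration; this is not code taken from the repository.

```python
def filter_small_boxes(boxes, min_size):
    """Return the boxes whose width and height are both >= min_size.

    boxes: list of (x1, y1, x2, y2) tuples in pixels (assumed layout).
    """
    kept = []
    for x1, y1, x2, y2 in boxes:
        width = x2 - x1 + 1    # inclusive pixel convention (assumption)
        height = y2 - y1 + 1
        if width >= min_size and height >= min_size:
            kept.append((x1, y1, x2, y2))
    return kept


proposals = [
    (0, 0, 9, 9),          # 10 x 10, a small object
    (5, 5, 40, 60),        # 36 x 56
    (100, 100, 111, 180),  # 12 x 81, narrow
]

# With a 16-pixel threshold only the large box survives.
with_threshold = filter_small_boxes(proposals, min_size=16)
# With the threshold disabled, the small proposals are kept as well.
without_threshold = filter_small_boxes(proposals, min_size=0)
print(len(with_threshold), len(without_threshold))  # prints: 1 3
```

Keeping such boxes gives the detector a chance to score small objects that a size filter would have removed before classification.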

Additional Features

Additional features are added to make research life easier:

  • Support for train and validation. During training, the validation data will also be tested from time to time to monitor the process and check for potential overfitting. Ideally training and validation should be separate, where the model is loaded every time to test on validation. However, I have implemented it in a joint way to save time and GPU memory.
  • Support for stop and retrain. I tried to store as much information as possible when snapshotting, so that training can resume properly from the latest snapshot. The meta information includes the current image index, the permutation of images, and the random state of numpy. However, when you resume training, the random seed for tensorflow is reset (I am not sure how to save the random state of tensorflow yet), so the resumed run will differ. Note that the current implementation still cannot force the model to behave deterministically even with the random seed set. Suggestions and solutions are welcome and much appreciated.
  • Support for visualization. The current implementation will summarize statistics of losses, activations and variables during training, and dump them to a separate folder for tensorboard visualization. The computation graph is also saved for debugging.
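
The stop-and-retrain idea can be sketched as follows: alongside a model snapshot, pickle the current image index, the image permutation, and the random generator state, then restore them on resume. This sketch uses Python's standard `random` module as a stand-in so that it runs without extra packages; numpy exposes the analogous `np.random.get_state()` and `np.random.set_state()`. The function names, the dictionary keys, and the file name are hypothetical, and the random state of tensorflow is not covered, in line with the limitation described above.

```python
import os
import pickle
import random
import tempfile


def save_meta(path, cur_index, perm):
    """Pickle the data-order state that accompanies a model snapshot."""
    meta = {
        "cur": cur_index,                # index of the next image to visit
        "perm": list(perm),              # current shuffled order of images
        "rng_state": random.getstate(),  # stand-in for numpy's random state
    }
    with open(path, "wb") as f:
        pickle.dump(meta, f, pickle.HIGHEST_PROTOCOL)


def load_meta(path):
    """Restore the generator state and return (cur_index, perm)."""
    with open(path, "rb") as f:
        meta = pickle.load(f)
    random.setstate(meta["rng_state"])
    return meta["cur"], meta["perm"]


random.seed(0)
perm = list(range(10))
random.shuffle(perm)

path = os.path.join(tempfile.mkdtemp(), "iter_5.pkl")  # hypothetical name
save_meta(path, cur_index=5, perm=perm)

# Draws made after the snapshot, as an uninterrupted run would make them.
expected = [random.random() for _ in range(3)]

# Resuming restores the state, so the same draws are produced again.
cur, restored_perm = load_meta(path)
resumed = [random.random() for _ in range(3)]
print(cur, restored_perm == perm, resumed == expected)  # prints: 5 True True
```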

Prerequisites

  • A basic Tensorflow installation. r0.12 is fully tested. r0.10+ should in general be fine. To experiment with the original RoI pooling (which requires modification of the C++ code in tensorflow), you can check out my tensorflow fork.
  • Python packages you might not have: cython, python-opencv, easydict (similar to py-faster-rcnn).
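
A quick way to check the prerequisites before building is to ask Python whether each package can be located. The sketch below assumes the usual import names for the packages listed above (`Cython` for cython, `cv2` for python-opencv); the helper `find_missing` is hypothetical and only reports what is absent, without installing anything.

```python
import importlib.util

# Package name -> import name (import names are assumptions).
REQUIRED_MODULES = {
    "tensorflow": "tensorflow",
    "cython": "Cython",
    "python-opencv": "cv2",
    "easydict": "easydict",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the package names whose import name cannot be located."""
    missing = []
    for package, import_name in modules.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(import_name) is None:
            missing.append(package)
    return missing


missing = find_missing()
print("missing:", ", ".join(missing) if missing else "none")
```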

Installation

  1. Clone the repository
git clone https://github.com/endernewton/tf-faster-rcnn.git
  2. Build the Cython modules
cd tf-faster-rcnn/lib
make
  3. Download pre-trained models and weights
# return to the repository root
cd ..
# model for both voc and coco using default training scheme
./data/scripts/fetch_faster_rcnn_models.sh
# model for coco using longer training scheme (600k/790k)
./data/scripts/fetch_coco_long_models.sh
# weights for imagenet pretrained model, extracted from released caffe model
./data/scripts/fetch_imagenet_weights.sh

Right now the imagenet weights are used to initialize layers when building the graph for both training and testing, even though testing later restores the trained tensorflow models. This step could be removed in a simplified version.

Setup data

Please follow the instructions of py-faster-rcnn here to set up the VOC and COCO datasets, which involves downloading data and creating softlinks in the data folder. Since faster RCNN does not rely on pre-computed proposals, it is safe to ignore the steps that set up proposals.

If you find it useful, the data/cache folder created on my side is also shared here.

Testing

  1. Create a folder and a softlink to use the pretrained model
mkdir -p output/vgg16/
# a relative link target is resolved from output/vgg16/, hence the ../../ prefix
ln -s ../../data/faster_rcnn_models/voc_2007_trainval output/vgg16/
ln -s ../../data/faster_rcnn_models/coco_2014_train+coco_2014_valminusminival output/vgg16/
  2. Test
GPU_ID=0
./experiments/scripts/test_vgg16.sh $GPU_ID pascal_voc
./experiments/scripts/test_vgg16.sh $GPU_ID coco

Testing the pretrained model generally needs several GB of memory.
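
The folder-and-softlink step can also be sketched in Python. A relative link target is resolved from the directory that contains the link, so this sketch links to an absolute path. The helper `link_pretrained` is hypothetical, Python 3 is assumed (for `exist_ok`), and the demo runs in a temporary directory rather than a real checkout.

```python
import os
import tempfile


def link_pretrained(repo_root, net, model_dir):
    """Link <repo_root>/<model_dir> into <repo_root>/output/<net>/.

    Returns the path of the link.
    """
    out_dir = os.path.join(repo_root, "output", net)
    os.makedirs(out_dir, exist_ok=True)
    # Use an absolute target so the link does not depend on where it lives.
    target = os.path.abspath(os.path.join(repo_root, model_dir))
    link = os.path.join(out_dir, os.path.basename(target))
    if not os.path.islink(link):
        os.symlink(target, link)
    return link


# Demo in a throwaway directory that mimics the layout used above.
with tempfile.TemporaryDirectory() as root:
    model = os.path.join("data", "faster_rcnn_models", "voc_2007_trainval")
    os.makedirs(os.path.join(root, model))
    link = link_pretrained(root, "vgg16", model)
    print(os.path.relpath(link, root), os.path.isdir(link))
```

If `os.path.isdir` reports `False` for a link, the link exists but its target does not resolve.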

Training

  1. (Optional) If you have just tested the model, first remove the link to the pretrained model
rm -v output/vgg16/voc_2007_trainval
rm -v output/vgg16/coco_2014_train+coco_2014_valminusminival
  2. Train
GPU_ID=0
./experiments/scripts/vgg16.sh $GPU_ID pascal_voc
./experiments/scripts/vgg16.sh $GPU_ID coco
  3. Visualization with Tensorboard
tensorboard --logdir=tensorboard/vgg16/voc_2007_trainval/ --port=7001 &
tensorboard --logdir=tensorboard/vgg16/coco_2014_train+coco_2014_valminusminival/ --port=7002 &

The default number of training iterations is kept the same as in the original faster RCNN; however, it is beneficial to train longer for COCO (see report). Also note that, due to the nondeterministic nature of the current implementation, the performance can vary, but in general it should be within 1% of the reported numbers.
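
The commands in the Testing and Training sections follow one directory convention: snapshots live under output/, summaries under tensorboard/, each in a subfolder named after the network and the training set. The sketch below collects that convention in one place. The mapping is inferred from the commands shown here rather than read from the scripts, and the helper `experiment_dirs` is hypothetical.

```python
import os

# Dataset key -> training-set name, as they appear in the commands above.
TRAIN_IMDB = {
    "pascal_voc": "voc_2007_trainval",
    "coco": "coco_2014_train+coco_2014_valminusminival",
}


def experiment_dirs(dataset, net="vgg16"):
    """Return (snapshot_dir, tensorboard_dir) for a dataset key."""
    imdb = TRAIN_IMDB[dataset]
    return (os.path.join("output", net, imdb),
            os.path.join("tensorboard", net, imdb))


for key in TRAIN_IMDB:
    snapshot_dir, tensorboard_dir = experiment_dirs(key)
    print(key, snapshot_dir, tensorboard_dir)
```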

Citation

If you find this implementation or the analysis conducted in our report helpful, please consider citing:

@article{chen17implementation,
    Author = {Xinlei Chen and Abhinav Gupta},
    Title = {An Implementation of Faster RCNN with Study for Region Sampling},
    Journal = {arXiv preprint arXiv:},
    Year = {2017}
}

For convenience, here is the faster RCNN citation:

@inproceedings{renNIPS15fasterrcnn,
    Author = {Shaoqing Ren and Kaiming He and Ross Girshick and Jian Sun},
    Title = {Faster {R-CNN}: Towards Real-Time Object Detection
             with Region Proposal Networks},
    Booktitle = {Advances in Neural Information Processing Systems ({NIPS})},
    Year = {2015}
}
