adds architecture for grid k520, tesla k80 gpus
SnShine committed May 23, 2017
1 parent 271d6c5 commit c9ace58
Showing 1 changed file with 33 additions and 25 deletions.
58 changes: 33 additions & 25 deletions README.md
@@ -1,36 +1,36 @@
# tf-faster-rcnn
A Tensorflow implementation of faster RCNN detection framework by Xinlei Chen (xinleic@cs.cmu.edu). This repository is based on the python Caffe implementation of faster RCNN available [here](https://github.com/rbgirshick/py-faster-rcnn).

**Note**: Several minor modifications are made when reimplementing the framework, which give potential improvements. For details about the modifications and ablative analysis, please refer to the technical report [An Implementation of Faster RCNN with Study for Region Sampling](https://arxiv.org/pdf/1702.02138.pdf). If you are seeking to reproduce the results in the original paper, please use the [official code](https://github.com/ShaoqingRen/faster_rcnn) or maybe the [semi-official code](https://github.com/rbgirshick/py-faster-rcnn). For details about the faster RCNN architecture please refer to the paper [Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks](http://arxiv.org/pdf/1506.01497.pdf).

### Detection Performance
The current code supports **VGG16** and **Resnet V1** models. We mainly tested it on plain VGG16 and Resnet101 (thank you @philokey!) architecture. As the baseline, we report numbers using a single model on a single convolution layer, so no multi-scale, no multi-stage bounding box regression, no skip-connection, no extra input is used. The only data augmentation technique is left-right flipping during training following the original Faster RCNN. All models are released.

With VGG16 (``conv5_3``):
- Train on VOC 2007 trainval and test on VOC 2007 test, **71.2**.
- Train on VOC 2007+2012 trainval and test on VOC 2007 test ([R-FCN](https://github.com/daijifeng001/R-FCN) schedule), **75.3**.
- Train on COCO 2014 [trainval35k](https://github.com/rbgirshick/py-faster-rcnn/tree/master/models) and test on [minival](https://github.com/rbgirshick/py-faster-rcnn/tree/master/models) (900k/1190k), **29.5**.

With Resnet101 (last ``conv4``):
- Train on VOC 2007 trainval and test on VOC 2007 test, **75.2**.
- Train on VOC 2007+2012 trainval and test on VOC 2007 test (R-FCN schedule), **79.3**.
- Train on COCO 2014 trainval35k and test on minival (900k/1190k), **34.1**.

More Resnets:
- Train Resnet50 on COCO 2014 trainval35k and test on minival (900k/1190k), **31.6**.
- Train Resnet152 on COCO 2014 trainval35k and test on minival (900k/1190k), **35.2**.

Approximate [FPN](https://arxiv.org/abs/1612.03144) *baseline* [setup](https://github.com/endernewton/tf-faster-rcnn/blob/master/experiments/cfgs/res101-lg.yml):
- Train Resnet50 on COCO 2014 trainval35k and test on minival (900k/1190k), **33.4**.
- Train Resnet101 on COCO 2014 trainval35k and test on minival (1000k/1390k), **36.0**.
- Train Resnet152 on COCO 2014 trainval35k and test on minival (1000k/1390k), **37.2**.

**Note**:
- Due to the randomness in GPU training with Tensorflow, especially for VOC, the best numbers are reported here (out of 2-3 attempts). In my experience, for COCO you can almost always get a very close number (within ~0.2%) despite the randomness.
- **All** the numbers are obtained with a different testing scheme that does not select region proposals using non-maximal suppression (TEST.MODE top). The default and original testing scheme (TEST.MODE nms) will likely result in slightly worse performance (see [report](https://arxiv.org/pdf/1702.02138.pdf); for COCO it drops 0.X AP).
- Since we keep the small proposals (\< 16 pixels width/height), our performance is especially good for small objects.
- For other minor modifications, please check the [report](https://arxiv.org/pdf/1702.02138.pdf). Notable ones include using ``crop_and_resize``, and excluding ground truth boxes in RoIs during training.
- For COCO, we find the performance improving with more iterations (VGG16 350k/490k: 26.9, 600k/790k: 28.3, 900k/1190k: 29.5), and potentially better performance can be achieved with even more iterations.
- For Resnets, we fix the first block (total 4) when fine-tuning the network, and only use ``crop_and_resize`` to resize the RoIs (7x7) without max-pool (which I find useless especially for COCO). The final feature maps are average-pooled for classification and regression. All batch normalization parameters are fixed. Weight decay is set to the Resnet101 default of 1e-4. Learning rate for biases is not doubled.
- For approximate [FPN](https://arxiv.org/abs/1612.03144) baseline setup we simply resize the image with 800 pixels, add 32^2 anchors, and take 1000 proposals during testing.
- Check out [here](http://ladoga.graphics.cs.cmu.edu/xinleic/tf-faster-rcnn/)/[here](http://gs11655.sp.cs.cmu.edu/xinleic/tf-faster-rcnn/)/[here](https://drive.google.com/open?id=0B1_fAEgxdnvJSmF3YUlZcHFqWTQ) for the latest models, including longer COCO VGG16 models and Resnet ones.
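
The difference between the two testing schemes mentioned in the notes above (TEST.MODE top versus TEST.MODE nms) can be illustrated with a minimal pure-Python sketch. This is not the repository's code: the function names `select_top` and `select_nms`, the `(x1, y1, x2, y2)` box layout, and the 0.7 IoU threshold are all assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_top(boxes, scores, top_n):
    """'top'-style selection (hypothetical): keep the top_n best scores, no NMS."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    return order[:top_n]


def select_nms(boxes, scores, top_n, iou_thresh=0.7):
    """'nms'-style selection (hypothetical): greedy NMS, at most top_n survivors."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
        if len(keep) == top_n:
            break
    return keep


# Boxes 0 and 1 overlap heavily (IoU of about 0.82); box 2 is far away.
boxes = [(0, 0, 10, 10), (0, 1, 10, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(select_top(boxes, scores, 2))
print(select_nms(boxes, scores, 2))
```

With these toy inputs, the score-only selection keeps both overlapping boxes, while the NMS variant suppresses the second one and keeps the distant box instead.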
@@ -55,10 +55,18 @@ Additional features not mentioned in the [report](https://arxiv.org/pdf/1702.021
2. Update your -arch in setup script to match your GPU
```Shell
cd tf-faster-rcnn/lib
# Check the GPU architecture
# Change the GPU architecture (-arch) if necessary
vim setup.py
```
**Note**: I only have access to TitanX (both Maxwell and Pascal) and sm_52 seems to work fine on those machines. You are welcome to contribute the settings on your end if you have made the code work properly on other GPUs.

| GPU model | Architecture |
| ------------- | ------------- |
| TitanX (Maxwell/Pascal) | sm_52 |
| Grid K520 (AWS g2.2xlarge) | sm_30 |
| Tesla K80 (AWS p2.xlarge) | sm_37 |

**Note**: You are welcome to contribute the settings on your end if you have made the code work properly on other GPUs.
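
As a rough illustration of how the table above could be used, here is a small lookup sketch. The dictionary `GPU_ARCH` and the helper `arch_flag` are hypothetical names, not part of the repository; the mapping only restates the table, and the `-arch=sm_XX` form is the usual way the flag is passed to nvcc.

```python
# Mapping copied from the table above; extend it for other GPUs.
# GPU_ARCH and arch_flag are illustrative names, not repository code.
GPU_ARCH = {
    "TitanX (Maxwell/Pascal)": "sm_52",
    "Grid K520 (AWS g2.2xlarge)": "sm_30",
    "Tesla K80 (AWS p2.xlarge)": "sm_37",
}


def arch_flag(gpu_model):
    """Return the nvcc -arch flag for a known GPU model; raises KeyError otherwise."""
    return "-arch=" + GPU_ARCH[gpu_model]


print(arch_flag("Tesla K80 (AWS p2.xlarge)"))
```

The returned string is the value to put in place of the existing `-arch` setting in `setup.py`.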


3. Build the Cython modules
```Shell
@@ -77,7 +85,7 @@ Additional features not mentioned in the [report](https://arxiv.org/pdf/1702.021
### Setup data
Please follow the instructions of py-faster-rcnn [here](https://github.com/rbgirshick/py-faster-rcnn#beyond-the-demo-installation-for-training-and-testing-models) to set up the VOC and COCO datasets (Part of COCO is done). The steps involve downloading data and optionally creating softlinks in the ``data`` folder. Since faster RCNN does not rely on pre-computed proposals, it is safe to ignore the steps that set up proposals.

If you find it useful, the ``data/cache`` folder created on my side is also shared [here](http://ladoga.graphics.cs.cmu.edu/xinleic/tf-faster-rcnn/cache.tgz).

### Demo and Test with pre-trained models
1. Download pre-trained model
@@ -103,16 +111,16 @@ If you find it useful, the ``data/cache`` folder created on my side is also shar
```Shell
# at repository root
GPU_ID=0
CUDA_VISIBLE_DEVICES=${GPU_ID} ./tools/demo.py
```
**Note**: Resnet101 testing probably requires several gigabytes of memory, so if you encounter memory capacity issues, please install it with CPU support only. Refer to [Issue 25](https://github.com/endernewton/tf-faster-rcnn/issues/25).

4. Test with pre-trained Resnet101 models
```Shell
GPU_ID=0
./experiments/scripts/test_faster_rcnn.sh $GPU_ID pascal_voc_0712 res101
```
**Note**: If you cannot get the reported numbers (78.7 on my side), then the NMS function was probably compiled improperly; refer to [Issue 5](https://github.com/endernewton/tf-faster-rcnn/issues/5).

### Train your own model
1. Download pre-trained models and weights. The current code supports VGG16 and Resnet V1 models. Pre-trained models are provided by slim; you can get them [here](https://github.com/tensorflow/models/tree/master/slim#pre-trained-models) and place them in the ``data/imagenet_weights`` folder. For example, for the VGG16 model, you can set it up like:
@@ -151,7 +159,7 @@ If you find it useful, the ``data/cache`` folder created on my side is also shar
tensorboard --logdir=tensorboard/vgg16/voc_2007_trainval/ --port=7001 &
tensorboard --logdir=tensorboard/vgg16/coco_2014_train+coco_2014_valminusminival/ --port=7002 &
```

4. Test and evaluate
```Shell
./experiments/scripts/test_faster_rcnn.sh [GPU_ID] [DATASET] [NET]
@@ -162,7 +170,7 @@ If you find it useful, the ``data/cache`` folder created on my side is also shar
./experiments/scripts/test_faster_rcnn.sh 0 pascal_voc vgg16
./experiments/scripts/test_faster_rcnn.sh 1 coco res101
```

5. You can use ``tools/reval.sh`` for re-evaluation


@@ -196,7 +204,7 @@ If you find this implementation or the analysis conducted in our report helpful,
Journal = {arXiv preprint arXiv:1702.02138},
Year = {2017}
}

For convenience, here is the faster RCNN citation:

@inproceedings{renNIPS15fasterrcnn,
@@ -226,7 +234,7 @@ VGG16 COCO 2015 test-dev (900k/1190k):
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.451
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.591
```

VGG16 COCO 2015 test-std (900k/1190k):
```
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.295