ECCV 2022 Workshop: A Benchmark for Robustness to Individual Nuisances in Real-World Out-of-Distribution Shifts

wujiekd/ROBIN-dataset

 
 


OOD-CV dataset

Dataset download

The updated dataset can be accessed from here. The folder processed_dataset contains the dataset processed by converter.py in this repo. You can process the raw dataset (ROBINv1.1.zip) any way you want, but the names listed in the CSV files are the entries we will use to evaluate on the CodaLab server. Note that we include a new nuisance in our dataset: occlusion.

Phase-2 dataset

The dataset used by Phase-2 can be accessed from here. An email describing the rules for Phase-2 and final prize consideration will be sent out after Phase-2 begins.

Rules about Phase-1 data

  1. The Phase-1 data can be used as a validation set, but its labels cannot be used for training.
  2. The Phase-1 data also cannot be used as an unlabeled training set for Phase-2 submissions.

Evaluation

Our aim is to measure model robustness w.r.t. OOD nuisance factors. Therefore, the final benchmark scoring is data- and accuracy-constrained. This means that, to be valid, a submission must:

  1. Only use the training data that we provide; using outside data is not allowed.
  2. Keep the model’s accuracy on the I.I.D. test set below a pre-defined threshold (defined by the performance of a baseline model). The final benchmark score is then measured as the average performance on the held-out O.O.D. test set.

The I.I.D. accuracy thresholds are as follows:

  • Image Classification = 91.1 [top-1 accuracy]
  • Object Detection = 79.9 [mAP@50]
  • Pose Estimation = 68.7 [Acc@pi/6]

Each accuracy threshold was determined by training the baseline model five times, computing the mean performance, and adding three standard deviations.
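The mean-plus-three-standard-deviations rule above can be sketched as follows. The accuracy values are illustrative placeholders, not the actual baseline results, and whether the organizers used the sample or population standard deviation is an assumption (sample is used here):

```python
import statistics

# Hypothetical top-1 accuracies from five baseline training runs
# (illustrative numbers only, not the real baseline results).
runs = [90.2, 90.5, 90.1, 90.4, 90.3]

mean = statistics.mean(runs)
std = statistics.stdev(runs)   # sample standard deviation (n - 1 denominator)
threshold = mean + 3 * std     # mean plus three standard deviations
```

A submission whose I.I.D. accuracy lands above `threshold` would be rejected under this scoring rule.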

The evaluation code used on the CodaLab server is provided in the evaluation folder.

CodaLab Servers

Task                  CodaLab
Image-Classification  https://codalab.lisn.upsaclay.fr/competitions/6781
Object-Detection      https://codalab.lisn.upsaclay.fr/competitions/6784
3D-Pose-Estimation    https://codalab.lisn.upsaclay.fr/competitions/6783

Changes

  1. Phase-1 of the competition will not be a code-submission challenge; we have released all the test data and labels in this repo. Phase-1 will also last longer than originally planned, and we will ask each team to provide a description of their development environment at the end of Phase-1. Phase-2 will still be a code-submission challenge.
  2. We will be using top-1 accuracy for image classification, mAP@50 for object detection, and Acc@pi/6 for pose estimation as the metrics. The IID test performance will also be considered, as requested by the sponsor: we will penalize submissions whose IID performance differs significantly from our baseline.
  3. The only limitation now is that the model may only be trained on the given training set and/or the ImageNet-1k dataset; no additional dataset is allowed. You can use any ensemble, data augmentation, or test-time training technique.

This is the official repository for the OOD-CV dataset.

The .csv file in each folder of the zip file contains the bounding box and 3D pose annotations for each image; please refer to this repo to see how we convert the MATLAB annotations into these CSV files. Please note that for some nuisances there are no images for particular categories, e.g. there are no diningtables in OOD weather.

For image classification, you will need to crop the bounding boxes from the images with 10-pixel padding.
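A minimal sketch of that cropping step, assuming the boxes are given as (x1, y1, x2, y2) pixel coordinates (check the actual CSV column conventions): the box is expanded by the padding on each side and clamped to the image boundaries.

```python
def crop_box(x1, y1, x2, y2, img_w, img_h, pad=10):
    """Expand a bounding box by `pad` pixels on every side,
    clamped so it never leaves the image."""
    return (max(0, x1 - pad), max(0, y1 - pad),
            min(img_w, x2 + pad), min(img_h, y2 + pad))

# Box well inside a 200x150 image: padding is applied on all sides.
box = crop_box(50, 40, 120, 100, img_w=200, img_h=150)  # (40, 30, 130, 110)
```

The returned tuple can be passed directly to an image library's crop call, e.g. PIL's `Image.crop`.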

For object detection, we recommend using the pascal-voc-writer library to convert the .csv files into PASCAL-VOC format for training and testing.
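As a rough illustration of the target format (the pascal-voc-writer library is the recommended route), here is a stdlib-only sketch that turns one CSV row into a minimal PASCAL-VOC annotation. The column names `im_name`, `label`, `x1`, `y1`, `x2`, `y2` are assumptions for illustration; check the headers of the actual CSV files.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical CSV layout; verify against the real files in the dataset.
CSV_TEXT = """im_name,label,x1,y1,x2,y2
n001.jpg,car,10,20,110,220
"""

def row_to_voc(row, width, height):
    """Build a minimal PASCAL-VOC annotation element for one CSV row."""
    ann = ET.Element("annotation")
    ET.SubElement(ann, "filename").text = row["im_name"]
    size = ET.SubElement(ann, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    obj = ET.SubElement(ann, "object")
    ET.SubElement(obj, "name").text = row["label"]
    box = ET.SubElement(obj, "bndbox")
    for tag, col in (("xmin", "x1"), ("ymin", "y1"),
                     ("xmax", "x2"), ("ymax", "y2")):
        ET.SubElement(box, tag).text = row[col]
    return ann

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
xml_str = ET.tostring(row_to_voc(rows[0], 500, 375), encoding="unicode")
```

In practice you would write one such XML file per image, which is exactly what pascal-voc-writer automates.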

For 3D pose estimation, we recommend using the NeMo pipeline to train and test the model.

We will also provide the data-processing scripts and baselines, and add the occlusion nuisance, shortly.

Q&A

Q: Is it allowed to use additional data for training purposes?

A: No, for the data, only the images and the classification labels from ImageNet-1k can be used. For pretrained models, only ImageNet-1k pretrained models can be used for the challenge.

Q: Can we use strong data augmentations (e.g. GANs), ensembling, or test time training techniques?

A: Yes, but please note that we penalize submissions whose IID test performance deviates too far from the baseline, and most of these techniques also improve IID performance, so use them with caution.
