This repository contains the dataset and a description of the data and labels used in "Quantifying the LiDAR Sim-to-Real Domain Shift: A Detailed Investigation Using Object Detectors and Analyzing Point Clouds at Target-Level". The dataset includes 12,000 labeled point clouds in total, of which 6,000 were captured during the Indy Autonomous Challenge in Las Vegas in 2022. The other 6,000 samples were generated in simulation and include the same scenarios, objects, and environment as their real counterparts. Each point cloud file (.pcd) contains the fused point clouds of three LiDAR sensors, which together cover 360° horizontally. The labels for each point cloud (.txt) are in the same format as the labels of the KITTI dataset.
As this dataset is distribution-aligned, i.e., every real point cloud has a scenario-identical simulated counterpart with the same index, it can be used to study the domain shift or to evaluate the performance of domain adaptation algorithms.
Examples of a real (red) and sim (blue) point cloud showing the same scenario from our dataset.
Please follow this link to download the dataset (~12GB).
The dataset contains two main folders, real and sim, which are structured identically. Each contains the subdirectories data and ImageSets. data contains 6,000 point clouds in the .pcd format (in pcl) and 6,000 labels with the corresponding indices in a .txt format similar to the KITTI label format (in label). ImageSets contains three .txt files, train.txt, val.txt, and test.txt, listing the indices of the point clouds used for training, validation, and testing. Our split is 4,000, 1,000, and 1,000 samples for training, validation, and testing, respectively.
Sim2RealDistributionAlignedDataset
├── real
│ ├── data
│ │ │── pcl
│ │ │ │── 000000.pcd
│ │ │ │── ...
│ │ │ │── 029995.pcd
│ │ │── label
│ │ │ │── 000000.txt
│ │ │ │── ...
│ │ │ │── 029995.txt
│ ├── ImageSets
│ │ │── train.txt
│ │ │── val.txt
│ │ │── test.txt
├── sim
│ ├── data
│ │ │── pcl
│ │ │ │── 000000.pcd
│ │ │ │── ...
│ │ │ │── 029995.pcd
│ │ │── label
│ │ │ │── 000000.txt
│ │ │ │── ...
│ │ │ │── 029995.txt
│ ├── ImageSets
│ │ │── train.txt
│ │ │── val.txt
│ │ │── test.txt
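The layout above can be turned into paired real/sim file paths with a few lines of Python. The following is a minimal sketch and not part of the dataset tooling: the function names `read_split`, `sample_paths`, and `paired_samples` are hypothetical, and it assumes that each split file lists one index per line and that file names are the index followed by `.pcd` or `.txt`, as the tree suggests.

```python
from pathlib import Path


def read_split(root, domain, split):
    """Return the indices listed in <root>/<domain>/ImageSets/<split>.txt."""
    split_file = Path(root) / domain / "ImageSets" / f"{split}.txt"
    # Assumption: one index per line; blank lines are skipped.
    lines = split_file.read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def sample_paths(root, domain, index):
    """Return (point cloud path, label path) for one index in one domain."""
    data_dir = Path(root) / domain / "data"
    return data_dir / "pcl" / f"{index}.pcd", data_dir / "label" / f"{index}.txt"


def paired_samples(root, split):
    """Yield (index, real paths, sim paths) for indices listed in both domains."""
    sim_indices = set(read_split(root, "sim", split))
    for index in read_split(root, "real", split):
        if index in sim_indices:
            yield (
                index,
                sample_paths(root, "real", index),
                sample_paths(root, "sim", index),
            )
```

Calling `paired_samples("Sim2RealDistributionAlignedDataset", "train")` would then iterate over the training scenarios that appear in both subsets, which is the pairing needed to compare a real point cloud with its simulated counterpart.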
Please check our paper for a detailed data description. The following provides a brief summary of the real and sim data.
The real dataset was captured during the Indy Autonomous Challenge in Las Vegas in 2022. The vehicle used for data generation was an autonomous AV-21 equipped with three LiDAR sensors, each covering 120° horizontally to cover 360° in total. The .pcd files include the fused point clouds. Labeling was done semi-automatically using the GPS positions of the ego-vehicle and the other vehicles on track. The positions were refined using the point cloud distribution in the proximity of the initially placed 3D bounding boxes.
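This README does not list which per-point fields the fused .pcd files carry or whether the data section is ASCII or binary. One way to find out is to read the file header, which is plain text in the standard PCD format. The sketch below assumes the files follow that standard layout (keyword lines such as `FIELDS`, `POINTS`, and `DATA`, with `DATA` closing the header); the function name `read_pcd_header` is hypothetical.

```python
def read_pcd_header(path):
    """Parse the text header of a .pcd file into {KEYWORD: [values, ...]}."""
    header = {}
    # Open in binary mode so a binary data section cannot break decoding.
    with open(path, "rb") as f:
        for raw in f:
            line = raw.decode("ascii", errors="replace").strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and the leading comment line
            key, *values = line.split()
            header[key.upper()] = values
            if key.upper() == "DATA":
                break  # the header ends here; point data follows
    return header
```

For example, `read_pcd_header(path)["FIELDS"]` returns the field names and `int(read_pcd_header(path)["POINTS"][0])` the number of points. To load the points themselves, a point cloud library such as Open3D (`open3d.io.read_point_cloud`) is one option.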
The sim dataset is distribution-aligned, i.e., scenario-identical, with the real dataset. It was created using Unity and a custom LiDAR sensor model. The environment models the same racetrack as in the real data. The scenarios extracted from the real dataset were replayed in this simulation environment and point clouds were captured using the custom LiDAR sensor model. The labels were generated automatically in Unity.
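Both the real and the sim labels are stored as .txt files in a KITTI-style format. As a rough illustration, the sketch below parses one label file under the assumption that each line follows the standard 15-column KITTI object layout (type, truncated, occluded, alpha, 2D box, dimensions, location, rotation_y). The names `KittiLabel`, `parse_kitti_line`, and `load_labels` are hypothetical, and the column order and coordinate frame should be checked against the paper and the actual files before relying on it.

```python
from dataclasses import dataclass


@dataclass
class KittiLabel:
    type: str
    truncated: float
    occluded: int
    alpha: float
    bbox: tuple        # left, top, right, bottom (2D image box in KITTI)
    dimensions: tuple  # height, width, length
    location: tuple    # x, y, z of the box in the label's reference frame
    rotation_y: float


def parse_kitti_line(line):
    """Parse one label line, assuming the standard KITTI column order."""
    parts = line.split()
    if len(parts) < 15:
        raise ValueError(f"expected at least 15 columns, got {len(parts)}")
    v = [float(x) for x in parts[1:15]]
    return KittiLabel(
        type=parts[0],
        truncated=v[0],
        occluded=int(v[1]),
        alpha=v[2],
        bbox=tuple(v[3:7]),
        dimensions=tuple(v[7:10]),
        location=tuple(v[10:13]),
        rotation_y=v[13],
    )


def load_labels(path):
    """Return all labels in one .txt file, skipping empty lines."""
    with open(path) as f:
        return [parse_kitti_line(line) for line in f if line.strip()]
```

Because real and sim labels share an index, loading both files for the same index gives the bounding boxes of the same scenario in each domain.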
Real (left) and sim (right) AV-21 used for dataset generation.
If you find our work useful in your research, please consider citing:
@ARTICLE{Huch23DomainShift,
author={Huch, Sebastian and Scalerandi, Luca and Rivera, Esteban and Lienkamp, Markus},
journal={IEEE Transactions on Intelligent Vehicles},
title={Quantifying the LiDAR Sim-to-Real Domain Shift: A Detailed Investigation Using Object Detectors and Analyzing Point Clouds at Target-Level},
year={2023},
volume={},
number={},
pages={1-14},
doi={10.1109/TIV.2023.3251650}}
@misc{Huch_S2R_DAD_2023,
author = {Huch, Sebastian and Scalerandi, Luca and Rivera, Esteban and Lienkamp, Markus},
title = {S2R-DAD: Sim-to-Real Distribution-Aligned Dataset},
publisher = {Technical University of Munich},
url = {https://mediatum.ub.tum.de/1695833},
type = {Dataset},
year = {2023},
doi = {10.14459/2023mp1695833},
keywords = {Sim-to-Real; LiDAR; Point Cloud; Domain Shift; Domain Adaptation},
language = {en},
}