[ECCV 2024] SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments

fraunhoferhhi/spvloc

SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments

ECCV 2024
Fraunhofer Heinrich Hertz Institute, HHI; Humboldt University of Berlin

SPVLoc Overview

Our method calculates the indoor 6D camera pose by determining the image position and orientation relative to synthetic panoramas. The best panoramic match is found through semantic viewport matching.
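
As a rough geometric illustration of what a position "relative to a panorama" means, the sketch below maps a pixel of an equirectangular panorama to viewing angles. It is not taken from the SPVLoc code: the function name and the angle convention (image centre at yaw 0 and pitch 0, yaw growing to the right, pitch growing upwards) are assumptions made for this example only.

```python
import math


def equirect_pixel_to_angles(u, v, width, height):
    """Map a pixel in an equirectangular panorama to (yaw, pitch) in radians.

    Assumed convention (not taken from SPVLoc): the image centre looks along
    yaw = 0, pitch = 0; yaw grows to the right, pitch grows upwards.
    """
    yaw = (u / width - 0.5) * 2.0 * math.pi   # full width spans 360 degrees
    pitch = (0.5 - v / height) * math.pi      # full height spans 180 degrees
    return yaw, pitch


# A pixel three quarters of the way across the panorama, on the horizon line.
yaw, pitch = equirect_pixel_to_angles(u=768, v=256, width=1024, height=512)
print(math.degrees(yaw), math.degrees(pitch))
```

Under this convention, the horizontal position of a matched viewport corresponds to a yaw angle and its vertical position to a pitch angle.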

Overview

This is the code repository for the paper SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments.

This repository provides:

  • Code for training and testing SPVLoc on the ZInD and Structured3D (S3D) datasets.
  • Checkpoints to test SPVLoc on our compiled test sets with the variable-FoV network, and a checkpoint to test SPVLoc on the perspective images that come with S3D.
  • Scripts to convert ZInD to the Structured3D format, enabling its use for training 6D pose estimation methods.
  • Scripts to reproduce our test sets (variable FoV, variable pitch/roll offsets) for both datasets.
  • Visualization of results.

Updates

22.07.2024: Added source code. Please don't hesitate to report compatibility problems or other issues. There might be updates in the coming weeks. 🎉🎈🥳
15.07.2024: Added project page.
05.07.2024: We are happy to announce that SPVLoc has been accepted to ECCV 2024. The source code is undergoing last preparations and will be added here very soon.

Installation

Step 1: Setup Conda environment

# Create conda environment from environment.yaml file
conda env create -f environment.yaml
# Activate the environment
conda activate spvloc

Step 2: Install additional dependencies

# Build and install redner
./data/setup/install_redner.sh
# Install patched version of pyrender
./data/setup/install_pyrender.sh

Execute the following script if you want to run the code on a Linux server without a monitor.

./data/setup/prepare_mesa.sh

Step 3: Download the pretrained models

python -m spvloc.tools.download_pretrained_models
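
After the download, it can be useful to check that the checkpoints used by the testing commands are in place. The snippet below is a minimal sketch: it assumes the download script stores the files in data/pretrained_models (inferred from the -t paths used in this README), and the helper name `missing_checkpoints` is hypothetical.

```python
from pathlib import Path

# Checkpoint file names referenced by the testing commands in this README.
EXPECTED_CHECKPOINTS = ("ckpt_zind.ckpt", "ckpt_s3d.ckpt", "ckpt_s3d_16_9.ckpt")


def missing_checkpoints(model_dir="data/pretrained_models"):
    """Return the expected checkpoint files that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_CHECKPOINTS if not (root / name).is_file()]


# Run from the repository root.
missing = missing_checkpoints()
print("all checkpoints found" if not missing else f"missing: {missing}")
```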

Dataset Preparation (ZInD)

Step 1: Download the ZInD dataset

  1. Follow the procedure described here to get access to the dataset.
  2. Download the dataset to /path/to/original/zind_dataset

Step 2: Convert the dataset to S3D format

Use the following command to convert the dataset to the S3D format:

python -m spvloc.tools.zind_to_s3d \
    -i /path/to/original/zind_dataset \
    -o /path/to/zind_dataset
  • To export 3D scene meshes for preview, add -m. Note that this slows down the conversion.
  • To process a specific scene index xy (0-1574), add -idx xy.
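
To try the conversion on a single scene before processing the whole dataset, the command can be assembled in a small script. This is a sketch under assumptions: the helper `zind_to_s3d_cmd` is hypothetical, and combining -m with -idx in one call is assumed to work but is not confirmed here.

```python
import shlex


def zind_to_s3d_cmd(src, dst, scene_index=None, export_meshes=False):
    """Build the argument list for spvloc.tools.zind_to_s3d (hypothetical helper)."""
    cmd = ["python", "-m", "spvloc.tools.zind_to_s3d", "-i", src, "-o", dst]
    if export_meshes:
        cmd.append("-m")  # export 3D scene meshes for preview (slower)
    if scene_index is not None:
        if not 0 <= scene_index <= 1574:
            raise ValueError("scene index must be in the range 0-1574")
        cmd += ["-idx", str(scene_index)]
    return cmd


cmd = zind_to_s3d_cmd(
    "/path/to/original/zind_dataset",
    "/path/to/zind_dataset",
    scene_index=0,
    export_meshes=True,
)
# Inspect the command first; run it with subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```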

Step 3: Generate the testsets

python -m spvloc.tools.dataset_from_csv \
    -d data/datasets/zind/dataset_info.csv \
    -i /path/to/zind_dataset \
    -o /path/to/zind_testset

Dataset Preparation (S3D)

Step 1: Download the S3D dataset

  1. Follow the procedure described here to access the dataset.
  2. Download at least the panorama images (Structured3D_panorama_00.zip - Structured3D_panorama_17.zip) and the annotations (Structured3D_annotation_3d.zip). Also download the perspective images if you want to train or test with them, as done in our supplementary material.
  3. Extract the dataset to /path/to/s3d_dataset.

Step 2: Generate the testsets

python -m spvloc.tools.dataset_from_csv \
    -d data/datasets/s3d/dataset_info.csv \
    -i /path/to/s3d_dataset \
    -o /path/to/s3d_testset

Testing

Test SPVLoc on ZInD

python ./spvloc_train_test.py -c configs/config_eval_zillow.yaml \
    -t data/pretrained_models/ckpt_zind.ckpt \
    DATASET.PATH /path/to/zind_testset \
    TEST.PLOT_OUTPUT False \
    TEST.SAVE_PLOT_OUTPUT False \
    TEST.SAVE_PLOT_DETAILS False \
    POSE_REFINE.MAX_ITERS 0 \
    SYSTEM.NUM_WORKERS 0

Set POSE_REFINE.MAX_ITERS 1 to activate pose refinement.
Set SYSTEM.NUM_WORKERS to a higher value to use CPU parallelization in the dataloader.

Use the DATASET.TEST_SET_FURNITURE argument to select a different test set, e.g. full_90_10_10 for a 90-degree field of view and a roll/pitch variation of plus/minus 10 degrees.
The checkpoint is trained to handle angle variations within plus/minus 10 degrees. We will make a model trained with larger angle variation available soon.
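
The test-set name encodes the test configuration. The helper below splits such a name into its parts. It is only a sketch: the pattern `<furniture>_<fov>_<angle>_<angle>` is inferred from the single example full_90_10_10, the function name is hypothetical, and other test sets may be named differently.

```python
def parse_testset_name(name):
    """Split a test-set name such as 'full_90_10_10' into its parts.

    Assumed pattern (inferred from one example only):
    <furniture>_<fov>_<angle>_<angle>, with all angles in degrees.
    """
    furniture, fov, angle_a, angle_b = name.split("_")
    return {
        "furniture": furniture,
        "fov_deg": int(fov),
        # Which value is pitch and which is roll is not stated in this README.
        "angle_offsets_deg": (int(angle_a), int(angle_b)),
    }


print(parse_testset_name("full_90_10_10"))
```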

Test SPVLoc on S3D

Use -c configs/config_eval_s3d.yaml, DATASET.PATH /path/to/s3d_testset and -t data/pretrained_models/ckpt_s3d.ckpt to test the model with S3D.

Test SPVLoc on perspective images from S3D

Use -c configs/config_eval_s3d_perspective.yaml, DATASET.PATH /path/to/s3d_dataset and -t data/pretrained_models/ckpt_s3d_16_9.ckpt to test the model with S3D perspective images. Use DATASET.TEST_SET_FURNITURE to select full or empty images.
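
The three evaluation setups differ only in config file, checkpoint and dataset path, so the full command lines can be generated from one table. The sketch below assumes that the remaining overrides of the ZInD command (plot flags, refinement, workers) carry over unchanged to the S3D setups. The names `EVAL_PRESETS` and `eval_cmd` are hypothetical.

```python
import shlex

# (config, checkpoint, dataset path) per evaluation target, as listed in this README.
EVAL_PRESETS = {
    "zind": ("configs/config_eval_zillow.yaml",
             "data/pretrained_models/ckpt_zind.ckpt",
             "/path/to/zind_testset"),
    "s3d": ("configs/config_eval_s3d.yaml",
            "data/pretrained_models/ckpt_s3d.ckpt",
            "/path/to/s3d_testset"),
    "s3d_perspective": ("configs/config_eval_s3d_perspective.yaml",
                        "data/pretrained_models/ckpt_s3d_16_9.ckpt",
                        "/path/to/s3d_dataset"),
}


def eval_cmd(target, refine=False, num_workers=0, testset=None):
    """Build the evaluation command for one target (hypothetical helper)."""
    config, checkpoint, data_path = EVAL_PRESETS[target]
    cmd = [
        "python", "./spvloc_train_test.py", "-c", config, "-t", checkpoint,
        "DATASET.PATH", data_path,
        "TEST.PLOT_OUTPUT", "False",
        "TEST.SAVE_PLOT_OUTPUT", "False",
        "TEST.SAVE_PLOT_DETAILS", "False",
        "POSE_REFINE.MAX_ITERS", "1" if refine else "0",
        "SYSTEM.NUM_WORKERS", str(num_workers),
    ]
    if testset is not None:
        cmd += ["DATASET.TEST_SET_FURNITURE", testset]
    return cmd


for target in EVAL_PRESETS:
    print(shlex.join(eval_cmd(target)))
```

Each printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)`.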

Visualize results on ZInD or S3D

The testing script can be used to generate visualizations of the results. An overview image (see below) will be plotted or saved for the test images.
In the top row, the 3D model is rendered with the estimated camera pose and can be compared to the model rendered with ground truth pose.

Set TEST.PLOT_OUTPUT True to plot the overview images one at a time.
Set TEST.SAVE_PLOT_OUTPUT True to save overview images for all test images.
Set TEST.SAVE_PLOT_DETAILS True to save the parts of the overview image as separate images.
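
The three settings are passed as key/value overrides after the config and checkpoint arguments. As a small sketch, the hypothetical helper `plot_overrides` below turns Python booleans into those override tokens and appends them to the ZInD evaluation call.

```python
def plot_overrides(show=False, save=False, save_details=False):
    """Return the TEST.* override tokens for the visualization settings."""
    flags = {
        "TEST.PLOT_OUTPUT": show,                # plot overview images interactively
        "TEST.SAVE_PLOT_OUTPUT": save,           # save overview images for all test images
        "TEST.SAVE_PLOT_DETAILS": save_details,  # save the parts as separate images
    }
    tokens = []
    for key, enabled in flags.items():
        tokens += [key, str(enabled)]
    return tokens


base = [
    "python", "./spvloc_train_test.py",
    "-c", "configs/config_eval_zillow.yaml",
    "-t", "data/pretrained_models/ckpt_zind.ckpt",
    "DATASET.PATH", "/path/to/zind_testset",
]
print(" ".join(base + plot_overrides(save=True)))
```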

Visualization example

Training

Train SPVLoc on ZInD

python ./spvloc_train_test.py -c configs/config_zillow.yaml \
    DATASET.PATH /path/to/zind_dataset \
    TRAIN.BATCH_SIZE 10 \
    SYSTEM.NUM_WORKERS 8

Set TRAIN.BATCH_SIZE as high as your GPU memory allows. We used TRAIN.BATCH_SIZE 40 on an NVIDIA A100 GPU with 40 GB of memory.
Set SYSTEM.NUM_WORKERS to the highest possible value to maximize the benefits of CPU parallelization in the dataloader.

Train SPVLoc on S3D

Use -c configs/config_s3d.yaml and DATASET.PATH /path/to/s3d_dataset to train a model with S3D.
Add the flag DATASET.S3D_NO_PERSP_IMAGES True if you have not downloaded the full S3D dataset including all perspective images.
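
Putting the training options together, both training commands can be produced by one helper. This is a sketch under assumptions: `train_cmd` and `TRAIN_CONFIGS` are hypothetical names, and the S3D run is assumed to accept the same TRAIN.BATCH_SIZE and SYSTEM.NUM_WORKERS overrides as the ZInD command.

```python
import shlex

# Training configs named in this README.
TRAIN_CONFIGS = {
    "zind": "configs/config_zillow.yaml",
    "s3d": "configs/config_s3d.yaml",
}


def train_cmd(dataset, data_path, batch_size=10, num_workers=8,
              s3d_no_persp_images=False):
    """Build the training command for one dataset (hypothetical helper)."""
    cmd = [
        "python", "./spvloc_train_test.py", "-c", TRAIN_CONFIGS[dataset],
        "DATASET.PATH", data_path,
        "TRAIN.BATCH_SIZE", str(batch_size),
        "SYSTEM.NUM_WORKERS", str(num_workers),
    ]
    if s3d_no_persp_images:
        # Use when the S3D perspective images have not been downloaded.
        cmd += ["DATASET.S3D_NO_PERSP_IMAGES", "True"]
    return cmd


print(shlex.join(train_cmd("s3d", "/path/to/s3d_dataset", s3d_no_persp_images=True)))
```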

Acknowledgements

This work is partially funded by the German Federal Ministry of Economic Affairs and Climate Action (BIMKIT, grant no. 01MK21001H) and the German Federal Ministry for Digital and Transport (EConoM, grant no. 19OI22009C).
We would like to thank our student assistant, Roberto Perez Martinez, for his help and support during the development of the ZInD-to-S3D conversion scripts.

Citation

If you find this code or our method useful for your academic research, please cite our paper:

@inproceedings{Gard2024_SPVLOC,
 title        = {SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments},
 author       = {Niklas Gard and Anna Hilsmann and Peter Eisert},
 year         = 2024,
 journal      = {arXiv preprint arXiv:2404.10527}
}