
WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion


demo.mp4

Introduction

This repository is the official PyTorch implementation of WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion. For more information, please visit our project page.

Installation

Please see Installation for details.

Quick Demo

Registration

To download the SMPL body models (Neutral, Female, and Male), you need to register at the SMPL and SMPLify websites. The username and password for both sites will be requested while fetching the demo data.

Next, run the following script to fetch demo data. This script will download all the required dependencies including trained models and demo videos.

bash fetch_demo_data.sh

You can try it with one example video:

python demo.py --video examples/IMG_9732.mov --visualize

By default, we assume the camera focal length following CLIFF. If the camera intrinsics are known, you can pass them to SLAM as [fx fy cx cy], as in the demo example below:

python demo.py --video examples/drone_video.mp4 --calib examples/drone_calib.txt --visualize
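If you do not have a calibration file, a small helper along these lines can generate one. This is a sketch, not part of the repository: `write_default_calib` is a hypothetical name, and it encodes the common CLIFF-style assumption that the focal length equals the image diagonal in pixels, with the principal point at the image center, written as a single `fx fy cx cy` line.

```python
import math
import os
import tempfile

def write_default_calib(width, height, path):
    """Write a one-line 'fx fy cx cy' calibration file (hypothetical helper).

    Assumes focal length = image diagonal (a CLIFF-style default) and the
    principal point at the frame center.
    """
    focal = math.sqrt(width ** 2 + height ** 2)  # assumed focal length in pixels
    cx, cy = width / 2.0, height / 2.0           # principal point at image center
    with open(path, "w") as f:
        f.write(f"{focal:.2f} {focal:.2f} {cx:.2f} {cy:.2f}\n")

# Example for a 1920x1080 video:
path = os.path.join(tempfile.gettempdir(), "example_calib.txt")
write_default_calib(1920, 1080, path)
print(open(path).read().strip())  # fx fy cx cy
```

You would then pass the resulting file via `--calib`, replacing the assumed values with measured intrinsics whenever you have them.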

Dataset

Please see Dataset for details.

Evaluation

# Evaluate on 3DPW dataset
python -m lib.eval.evaluate_3dpw --cfg configs/yamls/demo.yaml TRAIN.CHECKPOINT checkpoints/wham_vit_w_3dpw.pth.tar

# Evaluate on RICH dataset
python -m lib.eval.evaluate_rich --cfg configs/yamls/demo.yaml TRAIN.CHECKPOINT checkpoints/wham_vit_w_3dpw.pth.tar

# Evaluate on EMDB dataset (also computes W-MPJPE and WA-MPJPE)
python -m lib.eval.evaluate_emdb --cfg configs/yamls/demo.yaml --eval-split 1 TRAIN.CHECKPOINT checkpoints/wham_vit_w_3dpw.pth.tar   # EMDB 1

python -m lib.eval.evaluate_emdb --cfg configs/yamls/demo.yaml --eval-split 2 TRAIN.CHECKPOINT checkpoints/wham_vit_w_3dpw.pth.tar   # EMDB 2
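To run all four benchmarks in sequence, a minimal wrapper script can loop over the commands above. This is a sketch (shown as a dry run that only `echo`s each command; drop the `echo` to actually execute):

```shell
#!/usr/bin/env bash
# Dry-run wrapper over the evaluation commands above (sketch, not in the repo).
set -e
CKPT=checkpoints/wham_vit_w_3dpw.pth.tar
CFG=configs/yamls/demo.yaml

# 3DPW and RICH use the same invocation pattern.
for ds in 3dpw rich; do
    echo python -m lib.eval.evaluate_${ds} --cfg "$CFG" TRAIN.CHECKPOINT "$CKPT"
done

# EMDB is evaluated on two splits.
for split in 1 2; do
    echo python -m lib.eval.evaluate_emdb --cfg "$CFG" --eval-split "$split" TRAIN.CHECKPOINT "$CKPT"
done
```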

Training

Will be updated.

Acknowledgement

We would like to sincerely thank Hongwei Yi and Silvia Zuffi for the discussion and proofreading. Part of this work was done while Soyong Shin was an intern at the Max Planck Institute for Intelligent Systems.

The base implementation is largely borrowed from VIBE and TCMR. We use ViTPose for 2D keypoint detection, and DPVO and DROID-SLAM for extracting camera motion. Please visit their official websites for more details.

TODO

  • Training implementation

  • Colab / Hugging Face release

  • Demo for custom videos

Citation

@article{shin2023wham,
    title={WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion},
    author={Shin, Soyong and Kim, Juyong and Halilaj, Eni and Black, Michael J.},
    journal={arXiv preprint arXiv:2312.07531},
    year={2023}
}

License

Please see License for details.

Contact

Please contact soyongs@andrew.cmu.edu for any questions related to this work.
