[arXiv'24] GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image

sdbds/GeoWizard

 
 

GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image


Xiao Fu*, Wei Yin*, Mu Hu*, Kaixuan Wang, Yuexin Ma, Ping Tan, Shaojie Shen, Dahua Lin†, Xiaoxiao Long†

* Equal contribution; † Corresponding authors
arXiv Preprint, 2024

🛠️ Setup

We tested our code in the following environment: Ubuntu 22.04, Python 3.9.18, CUDA 11.8.

  1. Clone this repository.
git clone git@github.com:fuxiao0719/GeoWizard.git
cd GeoWizard
  2. Install packages.
conda create -n geowizard python=3.9
conda activate geowizard
pip install -r requirements.txt
cd geowizard
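
After installation, a quick sanity check can confirm that the active interpreter matches the tested environment. The following is a minimal sketch and not part of the repository: the helper name `check_python_version` and the choice to compare only the major and minor version are assumptions made for illustration.

```python
import sys

# Tested environment stated in this README: Python 3.9.18.
TESTED_PYTHON = (3, 9)


def check_python_version(version_info=sys.version_info, tested=TESTED_PYTHON):
    """Return (ok, message) comparing major.minor against the tested version.

    Hypothetical helper for illustration; not part of GeoWizard.
    """
    current = (version_info[0], version_info[1])
    if current == tested:
        return True, f"Python {current[0]}.{current[1]} matches the tested version."
    return False, (
        f"Python {current[0]}.{current[1]} differs from the tested "
        f"{tested[0]}.{tested[1]}; the code may still work but is untested here."
    )


if __name__ == "__main__":
    ok, message = check_python_version()
    print(message)
```

A mismatch is reported as a warning rather than an error, since only the listed environment has been tested and other versions are not known to fail.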

🤖 Usage

Run inference for depth & normal

Place your images in a directory (for example, input/example, where we have prepared several cases) and run the inference command below. The depth and normal outputs will be stored in output/example.

python run_infer.py \
    --input_dir ${input path} \
    --output_dir ${output path} \
    --ensemble_size ${ensemble size} \
    --denoise_steps ${denoising steps} \
    --domain ${data type}
# e.g.
python run_infer.py \
    --input_dir input/example \
    --output_dir output \
    --ensemble_size 3 \
    --denoise_steps 10 \
    --domain "indoor"

Inference settings:

  • --domain: Data type. Options: "indoor", "outdoor", and "object". Note that "object" works best for background-free objects, such as those in Objaverse. We find that "indoor" suits most scenarios. Default: "indoor".
  • --ensemble_size and --denoise_steps: Trade-off between speed and accuracy; larger ensembles and more denoising steps give higher accuracy at the cost of longer inference. Defaults: 3 and 10, respectively.
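
When running inference over several folders or settings, it can help to build the command programmatically and validate the arguments first. The sketch below is a hypothetical wrapper, not part of the repository: the helper names `build_infer_command` and `run_inference` are assumptions, and only the flags, options, and defaults documented above are used. It assumes the working directory is the one containing run_infer.py.

```python
import shlex
import subprocess
import sys

# Options and defaults as documented in the inference settings above.
VALID_DOMAINS = ("indoor", "outdoor", "object")


def build_infer_command(input_dir, output_dir, domain="indoor",
                        ensemble_size=3, denoise_steps=10):
    """Build the argument list for run_infer.py (hypothetical helper)."""
    if domain not in VALID_DOMAINS:
        raise ValueError(f"domain must be one of {VALID_DOMAINS}, got {domain!r}")
    if ensemble_size < 1 or denoise_steps < 1:
        raise ValueError("ensemble_size and denoise_steps must be positive")
    return [
        sys.executable, "run_infer.py",
        "--input_dir", str(input_dir),
        "--output_dir", str(output_dir),
        "--ensemble_size", str(ensemble_size),
        "--denoise_steps", str(denoise_steps),
        "--domain", domain,
    ]


def run_inference(**kwargs):
    """Run the built command; raises CalledProcessError on a non-zero exit."""
    cmd = build_infer_command(**kwargs)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

For example, `run_inference(input_dir="input/example", output_dir="output")` issues the same command as the example above with the default settings.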

📝 TODO List

  • Add inference code for 3D reconstruction.
  • Add training code.
  • Test on more local environments.

📚 Related Work

We also encourage readers to follow these exciting concurrent works.

  • Marigold: a finetuned diffusion model for estimating monocular depth.
  • Wonder3D: generates multi-view normal maps and color images and reconstructs high-fidelity textured meshes.
  • HyperHuman: a latent structural diffusion model and a structure-guided refiner for high-resolution human generation.
  • GenPercept: a finetuned UNet for many downstream image understanding tasks.

🔗 Citation

@article{fu2024geowizard,
  title={GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image},
  author={Fu, Xiao and Yin, Wei and Hu, Mu and Wang, Kaixuan and Ma, Yuexin and Tan, Ping and Shen, Shaojie and Lin, Dahua and Long, Xiaoxiao},
  journal={arXiv preprint arXiv:2403.12013},
  year={2024}
}
