GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image

Project Page | Paper | Hugging Face

Xiao Fu*, Wei Yin*, Mu Hu*, Kaixuan Wang, Yuexin Ma, Ping Tan, Shaojie Shen, Dahua Lin†, Xiaoxiao Long†

* Equal contribution; † Corresponding authors
arXiv Preprint, 2024

🛠️ Setup

We tested our code in the following environment: Ubuntu 22.04, Python 3.9.18, CUDA 11.8.

Clone this repository.

git clone git@github.com:fuxiao0719/GeoWizard.git
cd GeoWizard

Install the packages.

conda create -n geowizard python=3.9
conda activate geowizard
pip install -r requirements.txt
cd geowizard
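
Before running inference, it can help to confirm that the active environment roughly matches the tested one. The snippet below is a minimal sketch, not part of the repository: the function name `check_environment` is hypothetical, and looking for `nvidia-smi` on the PATH is only a rough heuristic for a working NVIDIA driver, not a check of the CUDA 11.8 toolkit itself.

```python
import shutil
import sys

# Python version from the tested environment listed above (3.9.x).
TESTED_PYTHON = (3, 9)


def check_environment(min_python=TESTED_PYTHON):
    """Return a list of warnings; an empty list means nothing suspicious was found.

    Hypothetical helper for illustration only; it is not shipped with GeoWizard.
    """
    warnings = []

    current = tuple(sys.version_info[:2])
    if current < tuple(min_python):
        warnings.append(
            f"Python {current[0]}.{current[1]} is older than the tested "
            f"Python {min_python[0]}.{min_python[1]}"
        )

    # Assumption: an NVIDIA driver install usually puts nvidia-smi on the PATH.
    if shutil.which("nvidia-smi") is None:
        warnings.append("nvidia-smi not found on PATH; a CUDA GPU may be unavailable")

    return warnings


if __name__ == "__main__":
    for message in check_environment():
        print("warning:", message)
```

An empty result does not guarantee that every package in requirements.txt installed correctly; it only rules out the two most common mismatches.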

🤖 Usage

Run inference for depth & normal

Place your images in a directory, for example input/example (where we have prepared several cases), and run the following inference command. The depth and normal outputs will be stored in output/example.

python run_infer.py \
    --input_dir ${input path} \
    --output_dir ${output path} \
    --ensemble_size ${ensemble size} \
    --denoise_steps ${denoising steps} \
    --domain ${data type}
# e.g.
python run_infer.py \
    --input_dir input/example \
    --output_dir output \
    --ensemble_size 3 \
    --denoise_steps 10 \
    --domain "indoor"
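
When several folders need processing, the command above can also be assembled from Python. The following is a sketch under stated assumptions: `build_infer_command` and `run_inference` are hypothetical helper names, the script is assumed to be launched from the geowizard directory where run_infer.py lives, and only the flags documented in this README are used.

```python
import subprocess
import sys

# Domain options listed in this README.
VALID_DOMAINS = ("indoor", "outdoor", "object")


def build_infer_command(input_dir, output_dir, ensemble_size=3,
                        denoise_steps=10, domain="indoor"):
    """Build the argument list for run_infer.py (hypothetical helper)."""
    if domain not in VALID_DOMAINS:
        raise ValueError(f"domain must be one of {VALID_DOMAINS}, got {domain!r}")
    if ensemble_size < 1 or denoise_steps < 1:
        raise ValueError("ensemble_size and denoise_steps must be at least 1")
    return [
        sys.executable, "run_infer.py",
        "--input_dir", str(input_dir),
        "--output_dir", str(output_dir),
        "--ensemble_size", str(ensemble_size),
        "--denoise_steps", str(denoise_steps),
        "--domain", domain,
    ]


def run_inference(**kwargs):
    """Run one inference job and raise if the script exits with an error."""
    return subprocess.run(build_infer_command(**kwargs), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this sketch is safe anywhere.
    print(" ".join(build_infer_command("input/example", "output")))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.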

Inference settings:

--domain: Data type. Options: "indoor", "outdoor", and "object". Note that "object" works best for background-free objects, such as those in Objaverse. We find that "indoor" suits most scenarios. Default: "indoor".
--ensemble_size and --denoise_steps: Trade-off arguments between speed and accuracy. Larger ensembles and more denoising steps give higher accuracy at the cost of longer inference time. Defaults: 3 and 10.
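
To give some intuition for why a larger ensemble can help, the sketch below combines several predictions of the same map with a pixel-wise median, which suppresses an outlier that appears in only one prediction. This is a generic illustration and an assumption on our part: `ensemble_median` is a hypothetical function, and the aggregation actually used by run_infer.py may differ.

```python
from statistics import median


def ensemble_median(predictions):
    """Pixel-wise median over equally sized 2D maps given as lists of rows.

    Illustrative only; not the aggregation code used by GeoWizard.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")

    height = len(predictions[0])
    width = len(predictions[0][0])
    for pred in predictions:
        if len(pred) != height or any(len(row) != width for row in pred):
            raise ValueError("all predictions must have the same shape")

    return [
        [median(pred[r][c] for pred in predictions) for c in range(width)]
        for r in range(height)
    ]


if __name__ == "__main__":
    # Three toy 2x2 "depth maps"; the second has an outlier in the last pixel.
    toy_predictions = [
        [[1.0, 2.0], [3.0, 4.0]],
        [[1.2, 2.0], [3.0, 9.0]],
        [[0.8, 2.0], [3.0, 4.2]],
    ]
    print(ensemble_median(toy_predictions))
```

Each extra ensemble member costs one more full denoising pass, which is where the speed side of the trade-off comes from.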

📝 TODO List

Add inference code for 3D reconstruction.
Add training code.
Test on more local environments.

📚 Related Work

We also encourage readers to follow these exciting concurrent works.

Marigold: a finetuned diffusion model for estimating monocular depth.
Wonder3D: generates multi-view normal maps and color images and reconstructs a high-fidelity textured mesh.
HyperHuman: a latent structural diffusion model and a structure-guided refiner for high-resolution human generation.
GenPercept: a finetuned UNet for many downstream image understanding tasks.

🔗 Citation

@article{fu2024geowizard,
  title={GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image},
  author={Fu, Xiao and Yin, Wei and Hu, Mu and Wang, Kaixuan and Ma, Yuexin and Tan, Ping and Shen, Shaojie and Lin, Dahua and Long, Xiaoxiao},
  journal={arXiv preprint arXiv:2403.12013},
  year={2024}
}
