iFusion: Inverting Diffusion for Pose-Free Reconstruction from Sparse Views
Chin-Hsuan Wu, Yen-Chun Chen, Bolivar Solarte, Lu Yuan, Min Sun
This is the official implementation of iFusion, a framework that extends existing single-view reconstruction to pose-free sparse-view reconstruction by repurposing Zero123 for camera pose estimation.
git clone https://github.com/chinhsuanwu/ifusion.git
cd ifusion
# conda environment.yaml is also available
pip install -r requirements.txt
Download the Zero123-XL checkpoint to ldm/ckpt:
wget https://zero123.cs.columbia.edu/assets/zero123-xl.ckpt -P ldm/ckpt
Run the demo by specifying an image directory that contains two or more images:
python demo.py data.image_dir=asset/sorter
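Since the demo expects at least two images in the directory, a quick pre-flight check can save a failed run. The following is a minimal sketch, not part of the repository: the helper names `list_images` and `has_enough_views` and the set of accepted extensions are assumptions made for illustration, and the repository's own loader may accept a different set of files.

```python
from pathlib import Path

# Assumed set of extensions; the loader in the repository may accept others.
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}


def list_images(image_dir):
    """Return the sorted image files found directly inside image_dir."""
    return sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )


def has_enough_views(image_dir, min_views=2):
    """Check that the directory holds at least min_views images."""
    return len(list_images(image_dir)) >= min_views
```

For example, `has_enough_views("asset/sorter")` should be true before running the command above. Note that output images written to the same directory (such as demo.png) would also be counted by this check on later runs.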
The output includes a NeRF-style transform.json file (from camera pose estimation), lora.ckpt (from fine-tuning), and demo.png (from fine-tuning), all located in the given directory.
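To inspect the estimated poses, the transform.json file can be read with the standard library. The sketch below assumes the common NeRF layout, i.e. a top-level "frames" list whose entries carry a "file_path" and a 4x4 camera-to-world "transform_matrix"; the exact keys written by iFusion are not confirmed here, and `load_camera_positions` is a hypothetical helper, so adjust it to the file you actually get.

```python
import json


def load_camera_positions(path):
    """Map each frame's file_path to its camera position.

    Assumes the usual NeRF convention: "transform_matrix" is a 4x4
    camera-to-world matrix, so the translation (camera position) is
    the last column of the first three rows.
    """
    with open(path) as f:
        meta = json.load(f)

    positions = {}
    for frame in meta["frames"]:
        m = frame["transform_matrix"]
        positions[frame["file_path"]] = [m[0][3], m[1][3], m[2][3]]
    return positions
```

Calling `load_camera_positions("asset/sorter/transform.json")` after the demo would then return one position per input image, provided the file follows the assumed layout.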
For comparison, you can also run a quick ablation that disables our method and falls back to the original single-view Zero123:
python demo.py data.image_dir=asset/sorter \
data.demo_fp=asset/sorter/demo_single_view.png \
inference.use_single_view=true
For 3D reconstruction, please check out ifusion-threestudio.
# download the renderings for GSO and OO3D
bash download_data.sh
# camera pose estimation
python main.py --pose \
--gpu_ids=0,1,2,3 \
data.root_dir=rendering \
data.name=GSO \
data.exp_root_dir=exp
# novel view synthesis
python main.py --nvs \
--gpu_ids=0,1,2,3 \
data.root_dir=rendering \
data.name=GSO \
data.exp_root_dir=exp
# evaluation
python eval.py --pose
python eval.py --nvs
Please refer to config/main.yaml for detailed hyper-parameters and arguments.
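When the same pipeline has to be run for several settings, it can help to assemble the commands programmatically. The following is a rough sketch that only reuses the flags and overrides shown above; `build_command` is a hypothetical helper and not part of the repository, and any value of data.name other than GSO is an assumption on the caller's side.

```python
import shlex


def build_command(stage, name, gpu_ids=(0, 1, 2, 3),
                  root_dir="rendering", exp_root_dir="exp"):
    """Build the main.py argument list for one stage ("pose" or "nvs")."""
    if stage not in ("pose", "nvs"):
        raise ValueError(f"unknown stage: {stage!r}")
    return [
        "python", "main.py", f"--{stage}",
        "--gpu_ids=" + ",".join(str(g) for g in gpu_ids),
        f"data.root_dir={root_dir}",
        f"data.name={name}",
        f"data.exp_root_dir={exp_root_dir}",
    ]


if __name__ == "__main__":
    # Print the two commands in order; pass each list to subprocess.run
    # from the repository root to actually execute them.
    for stage in ("pose", "nvs"):
        print(shlex.join(build_command(stage, "GSO")))
```

Building the command as a list rather than a single string avoids shell quoting issues if it is later passed to `subprocess.run`.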
@article{wu2023ifusion,
author = {Wu, Chin-Hsuan and Chen, Yen-Chun and Solarte, Bolivar and Yuan, Lu and Sun, Min},
title = {iFusion: Inverting Diffusion for Pose-Free Reconstruction from Sparse Views},
journal = {arXiv preprint arXiv:2312.17250},
year = {2023}
}
This repo is a wild mixture of zero123, threestudio, and lora. Kudos to the authors for their amazing work!