Official implementation of "Splatter Image: Ultra-Fast Single-View 3D Reconstruction"
- Create a conda environment:
conda create --name splatter-image
conda activate splatter-image
Install Pytorch following the official instructions. The Pytorch / Python / Pytorch3D combination that was verified to work is:
- Python 3.8, Pytorch 1.13.0, CUDA 11.6, Pytorch3D 0.7.2
Alternatively, you can create a separate environment with Pytorch3D 0.7.2 that you use only for CO3D data preprocessing. Once CO3D has been preprocessed, you can also use these combinations of Python / Pytorch:
- Python 3.7, Pytorch 1.12.1, CUDA 11.6
- Python 3.8, Pytorch 2.1.1, CUDA 12.1
Install other requirements:
pip install -r requirements.txt
- Install the Gaussian Splatting renderer, i.e. the library for rendering a Gaussian point cloud to an image. To do so, pull the Gaussian Splatting repository and, with your conda environment activated, run
pip install submodules/diff-gaussian-rasterization
You will need to meet the hardware and software requirements. We did all our experimentation on an NVIDIA A6000 GPU and speed measurements on an NVIDIA V100 GPU.
- If you want to train on CO3D data, you will need to install Pytorch3D 0.7.2. See instructions here. It is recommended to install with pip from a pre-built binary. Find a compatible binary here and install it with pip. For example, with Python 3.8, Pytorch 1.13.0, CUDA 11.6, run
pip install --no-index --no-cache-dir pytorch3d -f https://anaconda.org/pytorch3d/pytorch3d/0.7.2/download/linux-64/pytorch3d-0.7.2-py38_cu116_pyt1130.tar.bz2
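As a quick sanity check after installation, the following sketch compares the active environment against the Python / Pytorch combinations listed above. It is not part of this repository: the helper names (`installed_version`, `base_version`, `matches_verified`) are hypothetical, and it only compares version strings, so it does not verify CUDA or the rasterizer build.

```python
import platform
from importlib import metadata

# (Python, Pytorch) pairs listed in this README as verified to work.
VERIFIED = {("3.8", "1.13.0"), ("3.7", "1.12.1"), ("3.8", "2.1.1")}


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def base_version(version):
    """Strip a local build tag such as '+cu116' from a version string."""
    return None if version is None else version.split("+")[0]


def matches_verified(python_version, torch_version):
    """True if the (Python major.minor, Pytorch) pair is one of the listed combinations."""
    if torch_version is None:
        return False
    python_short = ".".join(python_version.split(".")[:2])
    return (python_short, base_version(torch_version)) in VERIFIED


if __name__ == "__main__":
    torch_version = installed_version("torch")
    print("Python:", platform.python_version())
    print("Pytorch:", torch_version)
    # Pytorch3D is only needed for CO3D preprocessing, so it is reported separately.
    print("Pytorch3D:", installed_version("pytorch3d"))
    print("Listed combination:", matches_verified(platform.python_version(), torch_version))
```

Other combinations may work as well; a `False` result only means the pair is not one of those listed here.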
For training / evaluating on ShapeNet-SRN classes (cars, chairs), please download the srn_*.zip (* = cars or chairs) from the PixelNeRF data folder. Unzip the data file and change SHAPENET_DATASET_ROOT in scene/srn.py to the parent folder of the unzipped folder. For example, if your folder structure is /home/user/SRN/srn_cars/cars_train, in scene/srn.py set SHAPENET_DATASET_ROOT="/home/user/SRN". No additional preprocessing is needed.
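To confirm that the root folder points at the right level, a small check along the following lines may help. It is a sketch rather than repository code: `missing_srn_folders` is a hypothetical helper, and it only looks for the `srn_<category>/<category>_train` layout from the example above. Any other split names passed via `splits` are an assumption about the archive contents.

```python
from pathlib import Path


def missing_srn_folders(dataset_root, category="cars", splits=("train",)):
    """Return the expected SRN folders that are absent under dataset_root.

    The layout <root>/srn_<category>/<category>_<split> follows the example
    path /home/user/SRN/srn_cars/cars_train given in this README.
    """
    root = Path(dataset_root)
    expected = [root / f"srn_{category}" / f"{category}_{split}" for split in splits]
    return [str(path) for path in expected if not path.is_dir()]


if __name__ == "__main__":
    # Example path from the README; replace with your own SHAPENET_DATASET_ROOT.
    missing = missing_srn_folders("/home/user/SRN", category="cars")
    print("Missing folders:", missing if missing else "none")
```

An empty list means the value is suitable for SHAPENET_DATASET_ROOT; a non-empty list usually means the path points one level too high or too low.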
For training / evaluating on CO3D, download the hydrant and teddybear classes from the CO3D release. To do so, run the following commands:
git clone https://github.com/facebookresearch/co3d.git
cd co3d
mkdir DOWNLOAD_FOLDER
python ./co3d/download_dataset.py --download_folder DOWNLOAD_FOLDER --download_categories hydrant,teddybear
Next, set CO3D_RAW_ROOT to your DOWNLOAD_FOLDER in data_preprocessing/preprocess_co3d.py. Set CO3D_OUT_ROOT to where you want to store the preprocessed data. Run python -m data_preprocessing.preprocess_co3d and set CO3D_DATASET_ROOT to the same path as CO3D_OUT_ROOT.
Pretrained ShapeNet-SRN models are now available here. Download the checkpoint together with its config folder. The folders also include test scores for reference.
CO3D models are not yet publicly available but we are working to release them as soon as possible.
Single-view models can be trained with the following command:
python train_network.py +dataset=[cars,chairs,hydrants,teddybears]
To train a 2-view model run:
python train_network.py +dataset=cars cam_embd=pose_pos data.input_images=2 opt.imgs_per_obj=5
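The bracketed list in the single-view command stands for a choice of one dataset name per run. The sketch below builds the corresponding command lines so that several runs can be scripted. `train_command` is a hypothetical helper, not part of the repository, and applying the 2-view overrides to datasets other than cars is an assumption, since only the cars command is given above.

```python
import shlex

# Dataset names accepted by +dataset=..., as listed in the command above.
DATASETS = ("cars", "chairs", "hydrants", "teddybears")

# Overrides taken from the 2-view cars command above.
TWO_VIEW_OVERRIDES = ["cam_embd=pose_pos", "data.input_images=2", "opt.imgs_per_obj=5"]


def train_command(dataset, two_view=False):
    """Build the argument list for one training run on a single dataset."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset {dataset!r}, expected one of {DATASETS}")
    command = ["python", "train_network.py", f"+dataset={dataset}"]
    if two_view:
        command += TWO_VIEW_OVERRIDES
    return command


if __name__ == "__main__":
    for name in DATASETS:
        print(shlex.join(train_command(name)))
    print(shlex.join(train_command("cars", two_view=True)))
```

Each list can be passed to `subprocess.run(command, check=True)` from the repository root to launch the run.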
Once a model is trained, evaluation can be run with
python eval.py $model_parent_folder
$model_parent_folder should hold a model_latest.pth file and a .hydra folder with config.yaml inside it.
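Before launching evaluation, it can be useful to confirm that the folder has the expected contents. The following is a minimal sketch under the layout described above; `missing_eval_files` is a hypothetical helper and is not provided by eval.py.

```python
import sys
from pathlib import Path


def missing_eval_files(model_parent_folder):
    """Return the files required for evaluation that are absent from the folder."""
    folder = Path(model_parent_folder)
    required = [
        folder / "model_latest.pth",         # trained weights
        folder / ".hydra" / "config.yaml",   # config saved with the run
    ]
    return [str(path.relative_to(folder)) for path in required if not path.is_file()]


if __name__ == "__main__" and len(sys.argv) > 1:
    missing = missing_eval_files(sys.argv[1])
    print("Missing files:", missing if missing else "none")
```

An empty result means the folder can be passed to eval.py as $model_parent_folder.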
To save renders, modify the variables save_vis and out_folder in eval.py.
The training loop is implemented in train_network.py and the evaluation code is in eval.py. Datasets are implemented in scene/srn.py and scene/co3d.py. The model is implemented in scene/gaussian_predictor.py. The call to the renderer can be found in gaussian_renderer/__init__.py.
The Gaussian rasterizer assumes row-major order of rigid body transform matrices, i.e. that position vectors are row vectors. It also requires cameras in the COLMAP / OpenCV convention, i.e. x points right, y points down, and z points away from the camera (forward).
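The two conventions can be illustrated with a short NumPy sketch. It is an illustration only, not code from this repository: the function names are hypothetical, and the OpenGL-style input (x right, y up, z backward) is an assumed starting point for data that does not already follow the COLMAP / OpenCV convention.

```python
import numpy as np


def to_row_vector_convention(transform_col):
    """Convert a 4x4 transform applied as T @ p (column vectors)
    into the matrix M applied as p @ M (row vectors), i.e. its transpose."""
    return np.asarray(transform_col, dtype=np.float64).T


def opengl_to_opencv_c2w(c2w_gl):
    """Flip the camera's y and z axes of a column-vector camera-to-world matrix:
    OpenGL (x right, y up, z backward) -> OpenCV (x right, y down, z forward)."""
    flip_yz = np.diag([1.0, -1.0, -1.0, 1.0])
    return np.asarray(c2w_gl, dtype=np.float64) @ flip_yz


# Example transform: rotate 90 degrees about z, then translate by (1, 2, 3).
transform_col = np.array([
    [0.0, -1.0, 0.0, 1.0],
    [1.0,  0.0, 0.0, 2.0],
    [0.0,  0.0, 1.0, 3.0],
    [0.0,  0.0, 0.0, 1.0],
])
point = np.array([1.0, 0.0, 0.0, 1.0])  # homogeneous position

transform_row = to_row_vector_convention(transform_col)

# Both conventions move the point to the same place.
assert np.allclose(transform_col @ point, point @ transform_row)
# With row vectors, the translation sits in the last row instead of the last column.
assert np.allclose(transform_row[3, :3], [1.0, 2.0, 3.0])
```

If renders come out mirrored or empty, a mismatch in one of these two conventions is a common cause worth checking first.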
@inproceedings{szymanowicz23splatter,
title={Splatter Image: Ultra-Fast Single-View 3D Reconstruction},
author={Stanislaw Szymanowicz and Christian Rupprecht and Andrea Vedaldi},
year={2023},
booktitle={arXiv},
}
S. Szymanowicz is supported by an EPSRC Doctoral Training Partnerships Scholarship (DTP) EP/R513295/1 and the Oxford-Ashton Scholarship. A. Vedaldi is supported by ERC-CoG UNION 101001212. We thank Eldar Insafutdinov for his help with installation requirements.