A "from scratch" re-implementation of 3D Gaussian Splatting for Real-Time Radiance Field Rendering by Kerbl and Kopanas et al.
This repository implements the forward and backward passes as a PyTorch CUDA extension, based on the algorithms described in the paper. Some details of the splatting and adaptive control algorithms are not explicitly described in the paper, so there may be differences between this repo and the official implementation.
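The usual pattern for wiring a CUDA extension into PyTorch's autograd is a custom `torch.autograd.Function` whose `forward` and `backward` call the compiled kernels. The sketch below illustrates that pattern only; the name `splat_cuda` and its functions are hypothetical, and the kernel calls are replaced by plain tensor ops so the example runs without the compiled extension.

```python
import torch

class Rasterize(torch.autograd.Function):
    """Illustrative autograd wrapper; real code would call the CUDA extension,
    e.g. splat_cuda.rasterize_forward(...) (hypothetical name)."""

    @staticmethod
    def forward(ctx, gaussians, weights):
        # Save inputs needed by the backward pass.
        ctx.save_for_backward(gaussians, weights)
        # Stand-in for the forward CUDA kernel: weighted sum over gaussians.
        return (gaussians * weights).sum(dim=0)

    @staticmethod
    def backward(ctx, grad_image):
        gaussians, weights = ctx.saved_tensors
        # Stand-in for the backward CUDA kernel: analytic gradients
        # of the weighted sum w.r.t. both inputs.
        return grad_image * weights, grad_image * gaussians

render = Rasterize.apply
```

Calling `render(gaussians, weights)` then participates in `loss.backward()` like any built-in op, with gradients supplied by the hand-written backward pass instead of autograd tracing.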
The forward and backward pass algorithms are detailed in `MATH.md`.
Evaluations were done with the Mip-NeRF 360 dataset at ~1 megapixel resolution. This corresponds to the 2x-downsampled indoor scenes and the 4x-downsampled outdoor scenes. Every 8th image was used for the test split.
Here are some comparisons with the official implementation (copied from its "Per-Scene Error Metrics").
Method | Dataset | PSNR | SSIM | N Gaussians | Train Duration* |
---|---|---|---|---|---|
Official-30k | Garden 1/4x | 27.41 | 0.87 | | ~35-45min (A6000) |
Ours-30k | Garden 1/4x | 26.86 | 0.85 | 2.78M | ~21min (RTX4090) |
Official-7k | Garden 1/4x | 26.24 | 0.83 | | |
Ours-7k | Garden 1/4x | 25.80 | 0.80 | 1.61M | ~3min (RTX4090) |
Official-30k | Counter 1/2x | 28.70 | 0.91 | | |
Ours-30k | Counter 1/2x | 28.60 | 0.90 | 2.01M | ~26min (RTX4090) |
Official-7k | Counter 1/2x | 26.70 | 0.87 | | |
Ours-7k | Counter 1/2x | 27.42 | 0.89 | 1.40M | ~5min (RTX4090) |
Official-30k | Bonsai 1/2x | 31.98 | 0.94 | | |
Ours-30k | Bonsai 1/2x | 31.45 | 0.94 | 0.84M | ~18min (RTX4090) |
Official-7k | Bonsai 1/2x | 28.85 | 0.91 | | |
Ours-7k | Bonsai 1/2x | 29.98 | 0.93 | 1.16M | ~4min (RTX4090) |
Official-30k | Room 1/2x | 30.63 | 0.91 | | |
Ours-30k | Room 1/2x | 31.52 | 0.92 | 1.84M | ~21min (RTX4090) |
Official-7k | Room 1/2x | 28.14 | 0.88 | | |
Ours-7k | Room 1/2x | 29.13 | 0.90 | 1.01M | ~3min (RTX4090) |
*Training times are not directly comparable across the different GPUs, since the RTX4090 is faster than the A6000; on the same hardware, the training speed of the two methods should be similar.
A comparison using one of the test images from the garden dataset. The official implementation's image appears more saturated because it was extracted from the published PDF. The branch in the exploded view and the wall are reconstructed more crisply in our implementation, but the official implementation performs better on the trees and bushes.
The gradient computation kernels are currently templated to support `float64` tensors, which are required by `torch.autograd.gradcheck`. All of the backward passes have `gradcheck` unit test coverage and should compute the correct gradients for the corresponding forward pass. A downside of this templating is that the kernels cannot use `float2/3/4` types, which could improve performance through better memory alignment.
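The reason `float64` support matters is that `gradcheck` compares the analytic backward pass against finite-difference approximations, which are far too noisy in single precision. A minimal sketch of the test pattern, using a hypothetical stand-in op rather than one of this repo's actual kernels:

```python
import torch

class ScaleAndSquare(torch.autograd.Function):
    """Toy differentiable op standing in for a splatting kernel."""

    @staticmethod
    def forward(ctx, x, scale):
        ctx.save_for_backward(x, scale)
        return scale * x * x

    @staticmethod
    def backward(ctx, grad_out):
        x, scale = ctx.saved_tensors
        # Hand-derived gradients: d/dx = 2*scale*x, d/dscale = x^2.
        return grad_out * 2.0 * scale * x, grad_out * x * x

# gradcheck requires double-precision inputs with requires_grad=True;
# it raises (or returns False) if the analytic and numeric gradients differ.
x = torch.randn(5, dtype=torch.float64, requires_grad=True)
s = torch.randn(5, dtype=torch.float64, requires_grad=True)
assert torch.autograd.gradcheck(ScaleAndSquare.apply, (x, s))
```

If the backward pass had a sign or indexing bug, `gradcheck` would report the mismatching Jacobian entries, which is what the unit tests rely on.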
The discrepancy in PSNR is most likely due to differences in the adaptive control algorithm and its tuning.
This package requires CUDA, which can be installed from here.
- Install Python dependencies

  ```shell
  pip install -r requirements.txt
  ```

- Install the PyTorch CUDA extension

  ```shell
  pip install -e ./
  ```
Note:
- Windows systems may need to modify the compilation flags in `setup.py`
- This step may be sensitive to the version of `pip`. This step failed after upgrading from `23.0.1` to `23.3.2`
- If `pip install` fails, this may work: `python setup.py build_ext && python setup.py install`
Optional: this project uses `clang-format` to lint the C++/CUDA files:

```shell
sudo apt install clang-format
```

Running `lint.sh` will run both `black` and `clang-format`.
- Download the Mip-NeRF 360 dataset and unzip it

  ```shell
  wget http://storage.googleapis.com/gresearch/refraw360/360_v2.zip && unzip 360_v2.zip
  ```

- Update `DATASET_PATH` in `splat_py/constants.py` to point to `garden`, with `DOWNSAMPLE_FACTOR = 4`
- Run `colmap_splat.py`
To run all unit tests:

```shell
python -m unittest discover test
```