LargeScaleNeRFPytorch

1. Weekly classified NeRF literature. 2. Non-official implementation of Block-NeRF and Mega-NeRF in Pytorch. 3. Train your large-scale NeRF in the wild.

Weekly classified NeRF

We track weekly NeRF papers and classify them. All previously published NeRF papers have been added to the list. We provide an English version and a Chinese version. We welcome contributions and corrections via PR.

We also provide an Excel version (the metadata) of all NeRF papers. You can add your own comments or build your own paper-analysis tools on top of the structured metadata.

[CVPR22Oral] Block-NeRF: Scalable Large Scene Neural View Synthesis

1. Introduction

This project aims to benchmark several state-of-the-art large-scale neural field algorithms, and is not restricted to the original Block-NeRF algorithm. The repo is titled BlockNeRFPytorch because the name is short and memorable.

Block-NeRF builds the largest neural scene representation to date, capable of rendering an entire neighborhood of San Francisco. The abstract of the Block-NeRF paper is as follows:

We present Block-NeRF, a variant of Neural Radiance Fields that can represent large-scale environments. Specifically, we demonstrate that when scaling NeRF to render city-scale scenes spanning multiple blocks, it is vital to decompose the scene into individually trained NeRFs. This decomposition decouples rendering time from scene size, enables rendering to scale to arbitrarily large environments, and allows per-block updates of the environment. We adopt several architectural changes to make NeRF robust to data captured over months under different environmental conditions. We add appearance embeddings, learned pose refinement, and controllable exposure to each individual NeRF, and introduce a procedure for aligning appearance between adjacent NeRFs so that they can be seamlessly combined. We build a grid of Block-NeRFs from 2.8 million images to create the largest neural scene representation to date, capable of rendering an entire neighborhood of San Francisco.
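One of the key architectural changes above is the per-image appearance embedding. As a rough illustration of the idea (not this repo's exact architecture; the module name, feature dimensions, and concatenation point below are illustrative assumptions), the color head of each block's NeRF can be conditioned on a learned code per training image:

```python
import torch
import torch.nn as nn

class AppearanceConditionedColorHead(nn.Module):
    """Toy NeRF color head conditioned on a learned per-image appearance code."""

    def __init__(self, num_images, feat_dim=256, appearance_dim=32):
        super().__init__()
        # One learnable embedding per training image (NeRF-W / Block-NeRF style).
        self.appearance = nn.Embedding(num_images, appearance_dim)
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + appearance_dim, 128), nn.ReLU(),
            nn.Linear(128, 3), nn.Sigmoid(),
        )

    def forward(self, point_features, image_ids):
        # point_features: [N, feat_dim] features from the NeRF trunk
        # image_ids:      [N] long tensor, index of the source image of each ray
        code = self.appearance(image_ids)                            # [N, appearance_dim]
        return self.mlp(torch.cat([point_features, code], dim=-1))   # [N, 3] RGB
```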

Our reproduced results of Block-NeRF:


This project is a non-official implementation of Block-NeRF. You can expect the following results from this repository:

  1. Large-scale NeRF training. The current results are as follows:
building-demo.mp4
  2. SOTA custom scenes. Reconstruct state-of-the-art NeRFs from your own collected photos. Here is a reconstructed video of my workstation:
sm01_04.mp4
  3. Google Colab support. Run trained Block-NeRF on Google Colab with detailed visualizations (not finished yet):

Open In Colab

Other features of this project are:

  • PyTorch implementation. The official Block-NeRF paper uses TensorFlow and requires TPUs. In contrast, this implementation only needs PyTorch.

  • GPU efficient. We ensure that almost all our experiments can be carried out on eight NVIDIA 2080Ti GPUs.

  • Quick download. We host many datasets on Google Drive so that downloading becomes much faster.

  • Uniform data format. The original Block-NeRF paper requires downloading tons of data from Google Cloud Platform. This repo provides processed data and convenient scripts, along with a uniform data format that suits many large-scale neural field datasets.

  • State-of-the-art performance. This project produces state-of-the-art rendering quality with better efficiency.

  • Quick validation. We provide quick validation tools to evaluate your ideas so that you don't need to train on the full Block-NeRF dataset.

  • Open research. Along with this project, we aim to develop a strong community working on this topic. We welcome you to join us (if you use WeChat, feel free to add my WeChat: ytc407). The contributors of this project are listed at the bottom of this page!

  • Chinese community. We will host regular Chinese tutorials and provide hands-on videos on general NeRF and on building your custom NeRFs in the wild and in the city. Feel free to add my WeChat if you have one.

You are welcome to star and watch this project. Thank you very much!

2. News

  • [2022.8.31] Training Mega-NeRF on the Waymo dataset.
  • [2022.8.24] Support the full Mega-NeRF pipeline.
  • [2022.8.18] Support all previous papers in weekly classified NeRF.
  • [2022.8.17] Support classification in weekly NeRF.
  • [2022.8.16] Support evaluation scripts and data format standard. Getting some results.
  • [2022.8.13] Add estimated camera pose and release a better dataset.
  • [2022.8.12] Add weekly NeRF functions.
  • [2022.8.8] Add the NeRF reconstruction code and doc for custom purposes.
  • [2022.7.28] The data preprocess script is finished.
  • [2022.7.20] This project started!

3. Installation

  1. Create conda environment.
    conda create -n nerf-block python=3.9
  2. Install TensorFlow, PyTorch, and other libs. Our setup: TensorFlow with CUDA 11.7.
    pip install --upgrade pip
    pip install -r requirements.txt
    pip install tensorflow 
    pip install --upgrade "jax[cuda]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
    conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
  3. Install other libs used for reconstructing custom scenes. This step is only needed if you want to build your own scenes.
    sudo apt-get install colmap
    sudo apt-get install imagemagick  # requires sudo access
    pip install -r requirements.txt
    conda install pytorch-scatter -c pyg  # or install via https://github.com/rusty1s/pytorch_scatter
    You can also use a local (e.g., laptop) installation of COLMAP if you do not have sudo access on your server. However, we found that if the COLMAP parameters are not set up properly, you will not get SOTA performance.
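After the installs above, a quick sanity check (a minimal sketch, assuming all three frameworks installed successfully) helps confirm that each framework can see your GPUs before you start training:

```python
import torch
print("PyTorch", torch.__version__, "| CUDA available:", torch.cuda.is_available())

import tensorflow as tf
print("TensorFlow", tf.__version__, "| GPUs:", tf.config.list_physical_devices("GPU"))

import jax
print("JAX devices:", jax.devices())
```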

4. Large-scale NeRF on the public datasets


Note that we provide useful debugging commands in many scripts. Debug commands require only a single GPU and may run slower than the standard commands. Use the standard commands for conducting experiments and comparisons. A sample bash file looks like this:

# arguments
ARGUMENTS HERE  # we provide sample arguments with explanations and options here.
# uncomment the following line for debugging
# DEBUG COMMAND HERE
# for standard training, keep the following line (comment it out when debugging)
STANDARD TRAINING COMMAND HERE
4.1 Download processed data.

What you should know before downloading the data:

(1) You don't need these steps if you only want to get results on your custom data (in other words, you can go directly to Section 5), but we recommend running on the public datasets first.

(2) Disclaimer: you should ensure that you have permission from the original data providers. You should first sign the license on the official Waymo website to get permission to download the Waymo data. Other data should likewise be downloaded and used in accordance with its original license.

(3) Our processed Waymo data is significantly smaller than the original version (19.1 GB vs. 191 GB) because we store camera poses instead of raw ray directions. In addition, our processed data is more friendly to PyTorch dataloaders.
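Storing poses rather than rays keeps the files small because per-pixel rays can be regenerated on the fly. A minimal sketch of standard pinhole ray generation from a camera-to-world pose is below (variable names and the camera convention are assumptions for illustration; the repo's dataloader may differ):

```python
import torch

def get_rays(H, W, focal, c2w):
    """Generate per-pixel ray origins and directions from a camera-to-world pose."""
    # Pixel grid in camera coordinates (OpenGL-style convention: -z is forward).
    j, i = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                          torch.arange(W, dtype=torch.float32), indexing="ij")
    dirs = torch.stack([(i - W * 0.5) / focal,
                        -(j - H * 0.5) / focal,
                        -torch.ones_like(i)], dim=-1)                # [H, W, 3]
    # Rotate directions into world space; the camera center is the ray origin.
    rays_d = torch.sum(dirs[..., None, :] * c2w[:3, :3], dim=-1)     # [H, W, 3]
    rays_o = c2w[:3, 3].expand(rays_d.shape)                         # [H, W, 3]
    return rays_o, rays_d
```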

You can download and preprocess all of the data and pretrained models via the following commands:

bash data_preprocess/download_waymo.sh  # download the Waymo dataset
bash data_preprocess/download_mega.sh   # download the Mega-NeRF datasets from the CMU server; the total size is around 31 GB

(Optional) You may also download the Mega-NeRF datasets (the same data that "download_mega.sh" fetches) from our Google Drive. You can download selected data from this table:

| Dataset name | Images & poses | Masks | Pretrained models |
| --- | --- | --- | --- |
| Waymo | waymo_image_poses | Not ready | Not ready |
| Building | building-pixsfm | building-pixsfm-grid-8 | building-pixsfm-8.pt |
| Rubble | rubble-pixsfm | rubble-pixsfm-grid-8 | rubble-pixsfm-8.pt |
| Quad | ArtsQuad_dataset - quad-pixsfm | quad-pixsfm-grid-8 | quad-pixsfm-8.pt |
| Residence | UrbanScene3D - residence-pixsfm | residence-pixsfm-grid-8 | residence-pixsfm-8.pt |
| Sci-Art | UrbanScene3D - sci-art-pixsfm | sci-art-pixsfm-grid-25 | sci-art-pixsfm-25-w-512.pt |
| Campus | UrbanScene3D - campus | campus-pixsfm-grid-8 | campus-pixsfm-8.pt |

The data structures follow the Mega-NeRF standards. We provide detailed explanations with examples for each data structure in this doc. After downloading the data, unzip the files and make folders via the following commands:

bash data_preprocess/process_mega.sh

If you are interested in processing the raw Waymo data on your own, please refer to this doc.

4.2 Run pretrained models.

We recommend evaluating the pretrained models before training your own. This way, you can quickly see the results of our provided models, and it helps rule out many environment issues. Run the following script to evaluate the pretrained models.

bash scripts/eval_trained_models.sh
# The rendered images would be placed under ${EXP_FOLDER}, which is set to data/mega/${DATASET_NAME}/exp_logs by default.

The sample output log by running this script can be found at docs/sample_logs/eval_trained_models.txt.
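If you want to check the rendered images beyond comparing against the sample log, a small PSNR computation against the ground-truth frames is a quick test (the file paths below are placeholders, not the script's fixed output layout):

```python
import numpy as np
import imageio.v2 as imageio

def psnr(pred, gt):
    """PSNR between two images of the same shape, given as float arrays in [0, 255]."""
    mse = np.mean((pred / 255.0 - gt / 255.0) ** 2)
    return -10.0 * np.log10(mse)

# Placeholder paths: point these at a rendered frame and its ground-truth image.
pred = imageio.imread("data/mega/building-pixsfm/exp_logs/render/000000.png").astype(np.float32)
gt = imageio.imread("data/mega/building-pixsfm/val/rgbs/000000.png").astype(np.float32)
print("PSNR:", psnr(pred, gt))
```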

4.3 Generate masks.

Why should we generate masks? (1) Masks help us convert camera poses + images into ray-based data, which lets us download the raw datasets quickly and train quickly as well. (2) Masks help us manage the boundaries of rays.
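Conceptually, each mask records which spatial submodule (block) a ray belongs to. A toy sketch of centroid-based assignment in the spirit of Mega-NeRF is below (the centroid layout and the nearest-centroid rule are simplified assumptions, not the exact logic of the mask scripts):

```python
import torch

def assign_points_to_blocks(points, centroids):
    """Assign each 3D point sampled along a ray to its nearest block centroid.

    points:    [N, 3] points sampled along rays
    centroids: [K, 3] centers of the K spatial submodules
    Returns a [N] tensor of block indices, which can be turned into per-block masks.
    """
    dists = torch.cdist(points, centroids)   # [N, K] pairwise distances
    return dists.argmin(dim=-1)

# Example: a 2x4 grid of centroids -> 8 submodules (cf. the *-grid-8 datasets).
xy = torch.stack(torch.meshgrid(torch.linspace(-1, 1, 2),
                                torch.linspace(-1, 1, 4),
                                indexing="ij"), dim=-1).reshape(-1, 2)
centroids = torch.cat([xy, torch.zeros(8, 1)], dim=-1)   # lift the grid to 3D
points = torch.rand(1024, 3) * 2 - 1                     # dummy sample points
block_ids = assign_points_to_blocks(points, centroids)   # [1024] indices in 0..7
```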

Run one of the following two commands to create masks:

bash scripts/create_cluster_mask.sh                      # for the mega dataset
bash scripts/waymo_create_cluster_mask.sh                # for the waymo dataset
# The output would be placed under the ${MASK_PATH}, which is set to data/mega/${DATASET_NAME}/building-pixsfm-grid-8 by default.

The sample output log by running this script can be found at docs/sample_logs/create_cluster_mask.txt. The middle parts of the log have been deleted to save space.

4.4 Train sub-modules.

Run the following commands to train the sub-module (the block):

bash scripts/train_sub_modules.sh SUBMODULE_INDEX         # for the mega dataset
bash scripts/waymo_train_sub_modules.sh SUBMODULE_INDEX   # for the waymo dataset
# SUBMODULE_INDEX is the index of the submodule.

The sample output log from running this script can be found at docs/sample_logs/create_cluster_mask.txt. You can also train multiple modules simultaneously by using parscript to launch all the training procedures at once. I personally don't use parscript but use slurm launching scripts to launch all the required modules. The training time without multi-processing is around one day.
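If you use neither parscript nor slurm, a small launcher can start all submodule trainings from one machine, one GPU per submodule. This is a hypothetical sketch, not a script shipped with the repo; it only assumes the train_sub_modules.sh interface shown above:

```python
import os
import subprocess

NUM_SUBMODULES = 8            # e.g. a *-grid-8 configuration
GPUS = list(range(8))         # eight 2080Ti cards

procs = []
for idx in range(NUM_SUBMODULES):
    # Pin each submodule's training process to its own GPU.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(GPUS[idx % len(GPUS)]))
    procs.append(subprocess.Popen(["bash", "scripts/train_sub_modules.sh", str(idx)], env=env))

for p in procs:
    p.wait()
```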

4.5 Merge modules.

Run the following commands to merge the trained modules to a unified model:

bash scripts/merge_sub_modules.sh

After that, you can go back to 4.2 to evaluate your trained modules. The sample log can be found at docs/sample_logs/merge_sub_modules.txt.
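For intuition, Block-NeRF composites overlapping blocks at render time with inverse-distance weighting between the camera and the block centers. The sketch below illustrates that weighting scheme only; it is not the repo's merge implementation, and the function and argument names are illustrative:

```python
import torch

def composite_blocks(block_rgbs, block_centers, cam_position, power=4.0):
    """Blend per-block renders with inverse-distance weights to the camera.

    block_rgbs:    [K, H, W, 3] images rendered by the K visible blocks
    block_centers: [K, 3] block centroids
    cam_position:  [3] camera origin
    """
    dists = torch.linalg.norm(block_centers - cam_position, dim=-1)   # [K]
    weights = dists.clamp(min=1e-6) ** (-power)                       # closer blocks weigh more
    weights = weights / weights.sum()
    return torch.einsum("k,khwc->hwc", weights, block_rgbs)           # [H, W, 3]
```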

5. Build your custom large-scale NeRF

  1. Put your images under the data folder. The structure should look like:

    data
    └── Madoka              // Your folder name here.
        └── source          // Source images should be put here.
            ├── 1.png
            ├── 2.png
            └── ...

    The sample data is provided in our Google Drive folder. The Madoka and Otobai scenes can be found at this link.

  2. Run COLMAP to reconstruct scenes. This will probably take a long time.

    python tools/imgs2poses.py data/Madoka

    You can replace data/Madoka with your own data folder. If your COLMAP version is newer than 3.6 (which should not happen if you installed via apt-get), you need to change export_path to output_path at line 67 of colmap_wrapper.py.

  3. Train NeRF scenes.

    python run.py --config configs/custom/Madoka.py

    You can replace configs/custom/Madoka.py with other configs.

  4. Validate the training results and generate a fly-through video.

    python run.py --config configs/custom/Madoka.py --render_only --render_video --render_video_factor 8
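The render pass writes out individual frames. If you want to assemble them into a fly-through video yourself, a minimal sketch with imageio is below (requires imageio-ffmpeg; the frame folder and fps are placeholders, not the script's fixed output layout):

```python
import glob
import imageio.v2 as imageio

# Placeholder path: point this at the folder where run.py wrote the rendered frames.
frames = sorted(glob.glob("logs/Madoka/render_video/*.png"))
writer = imageio.get_writer("flythrough.mp4", fps=30)
for frame_path in frames:
    writer.append_data(imageio.imread(frame_path))
writer.close()
```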

6. Citations & acknowledgements

You may cite this repo to help convince reviewers of the reproducibility of your paper. If this repo helps you, please cite it as:

@software{Zhao_PytorchBlockNeRF_2022,
author = {Zhao, Zelin and Jia, Jiaya},
month = {8},
title = {{PytorchBlockNeRF}},
url = {https://github.com/dvlab-research/BlockNeRFPytorch},
version = {0.0.1},
year = {2022}
}

The original Block-NeRF and Mega-NeRF papers can be cited as:

 @InProceedings{Tancik_2022_CVPR,
    author    = {Tancik, Matthew and Casser, Vincent and Yan, Xinchen and Pradhan, Sabeek and Mildenhall, Ben and Srinivasan, Pratul P. and Barron, Jonathan T. and Kretzschmar, Henrik},
    title     = {Block-NeRF: Scalable Large Scene Neural View Synthesis},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {8248-8258}
}

@inproceedings{turki2022mega,
  title={Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs},
  author={Turki, Haithem and Ramanan, Deva and Satyanarayanan, Mahadev},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={12922--12931},
  year={2022}
}

We build on code and data from DVGO, Mega-NeRF, and SVOX2. Thanks to the authors for their great work!

Contributors

Thanks go to these wonderful people (emoji key):


Zelin Zhao

💻 🚧

EZ-Yang

💻

Alex-Zhang

🐛

This project follows the all-contributors specification. Contributions of any kind are welcome!

License

This project is released under the MIT License.