
Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery

CVPR 2024

Yuqi Zhang · Guanying Chen · Jiaxing Chen · Shuguang Cui

Project Page


We present a neural radiance field method for urban-scale semantic and building-level instance segmentation from aerial images by lifting noisy 2D labels to 3D.

Overview

This repository contains the following components to train Aerial Lifting:

  1. Dataset processing scripts, including:
    1. far-view semantic label fusion;
    2. cross-view instance label grouping.
  2. Training and evaluation scripts.

Note: This is a preliminary release and there may still be some bugs.

Installation

Create a new conda env (CUDA)

  1. Clone this repo:

    git clone https://github.com/zyqz97/Aerial_lifting.git
    
  2. Create a conda environment (installation via Anaconda is recommended):

    conda create -n aeriallift python=3.9
    conda activate aeriallift
    
  3. Install PyTorch:

    conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
    
  4. Install tiny-cuda-nn and the remaining dependencies:

    pip install -r requirements.txt
    pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
    
  5. Install the extensions of torch-ngp:

    cd ./gp_nerf/torch_ngp/gridencoder
    python setup.py install
    cd ../raymarching
    python setup.py install
    cd ../shencoder
    python setup.py install
    
  6. Follow the official neuralsim repository to install nr3d_lib.

  7. Install SAM and download its checkpoint (a quick sanity check follows this list):

    git clone https://github.com/facebookresearch/segment-anything.git
    cd segment-anything
    pip install -e .
    # Run the next two lines from the Aerial_lifting root (not inside segment-anything)
    cd tools/segment_anything
    wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
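
After installation, you can optionally sanity-check the setup (a quick check; adjust the checkpoint path to wherever you ran the wget above):

    # Should print "1.10.1 True" when the CUDA build of PyTorch is active
    python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
    # The SAM checkpoint is roughly 2.4 GB on disk
    ls -lh tools/segment_anything/sam_vit_h_4b8939.pth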
    

Tested environments

Ubuntu 20.04 with PyTorch 1.10.1 and CUDA 11.3 on an A100 GPU.

Data Processing & Training Steps

  • We take the Yingrenshi dataset as an example; you need to set 'dataset_path=$YOURPATH/Aerial_lifting_data/Yingrenshi' and 'config_file=configs/yingrenshi.yaml' (see the sketch after this list).

  • We also provide the processed data in the section below; if you download it, the training scripts (Steps 1.1, 2.4, and 3.3) can be run directly.
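
A minimal sketch of the two settings above (depending on how each bash script is written, either export these variables before running or edit the corresponding lines inside the scripts):

    # Point dataset_path at your local copy of the processed data
    export dataset_path=$YOURPATH/Aerial_lifting_data/Yingrenshi
    export config_file=configs/yingrenshi.yaml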

Step 1. Training Geometry

  • 1.1 Train the geometry field.

    sh bash/train_geo.sh
    
    Note: $exp_name denotes the log-saving path (e.g., exp_name=logs/train_geo_yingrenshi); an example invocation is sketched below.
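
    A possible invocation (an assumption: this sketch passes exp_name through the environment; if the script hard-codes the path, edit bash/train_geo.sh instead):

    exp_name=logs/train_geo_yingrenshi sh bash/train_geo.sh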

Step 2. Training Semantic Field

  • 2.1 Get Mask2Former semantic labels

    To generate Mask2Former semantic labels, please use our modified version of Mask2Former from here. You need to create a separate conda environment for it. That code is largely based on MaskFormer and a modified version of Panoptic-Lifting.

    After setting up the Mask2Former environment, run:

    sh bash/2_1_m2f_labels.sh
    
  • 2.2 Render far-view RGB images from the checkpoint of Step 1.

    sh bash/2_2_get_far_view_images.sh
    

    Note: you need to specify $M2F_path, $exp_name, and $ckpt_path (a sketch of plausible values follows).
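
    For example (all three values below are hypothetical; match them to your local clones and to the actual checkpoint file written in Step 1.1):

    # Placeholder paths, not fixed by this repo
    M2F_path=$YOURPATH/Mask2Former
    exp_name=logs/train_geo_yingrenshi
    ckpt_path=$YOURPATH/path/to/step1_checkpoint.pth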

  • 2.3 Get the fused semantic labels.

    sh bash/2_3_fusion.sh
    
  • 2.4 Train the semantic field.

    After processing or downloading the data, you can use the script below to train the semantic field.

    sh bash/train_semantic.sh
    

Step 3. Training Instance Field

  • 3.1 Generate the SAM instance masks with the geo-filter

    sh bash/3_1_get_sam_mask_depth_filter.sh
    
  • 3.2 Generate the cross-view guidance map

    sh bash/3_2_cross_view_process.sh
    
  • 3.3 Train the instance field.

    After processing or downloading the data, you can use the script below to train the instance field.

    sh bash/train_instance.sh
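
For reference, the full pipeline above runs in this order (a summary of the steps; it assumes the variables used by each script are already set):

    sh bash/train_geo.sh                      # Step 1.1: geometry field
    sh bash/2_1_m2f_labels.sh                 # Step 2.1: Mask2Former labels (separate env)
    sh bash/2_2_get_far_view_images.sh        # Step 2.2: far-view RGB renders
    sh bash/2_3_fusion.sh                     # Step 2.3: fused semantic labels
    sh bash/train_semantic.sh                 # Step 2.4: semantic field
    sh bash/3_1_get_sam_mask_depth_filter.sh  # Step 3.1: geo-filtered SAM masks
    sh bash/3_2_cross_view_process.sh         # Step 3.2: cross-view guidance maps
    sh bash/train_instance.sh                 # Step 3.3: instance field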
    

Processed Dataset & Trained Models

Download the processed data and trained checkpoints.

We thank the authors for providing the datasets. If you find the datasets useful in your research, please cite the papers that provided the original aerial images:

@inproceedings{UrbanBIS,
  title={UrbanBIS: a Large-scale Benchmark for Fine-grained Urban Building Instance Segmentation},
  author={Guoqing Yang and Fuyou Xue and Qi Zhang and Ke Xie and Chi-Wing Fu and Hui Huang},
  booktitle={SIGGRAPH},
  year={2023}
}

@inproceedings{UrbanScene3D,
  title={Capturing, Reconstructing, and Simulating: the UrbanScene3D Dataset},
  author={Liqiang Lin and Yilin Liu and Yue Hu and Xingguang Yan and Ke Xie and Hui Huang},
  booktitle={ECCV},
  year={2022}
}

Citation

If you find this work useful for your research and applications, please cite our paper:

@inproceedings{zhang2024aerial,
  title={Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery},
  author={Zhang, Yuqi and Chen, Guanying and Chen, Jiaxing and Cui, Shuguang},
  booktitle={CVPR},
  year={2024}
}

Acknowledgements

Large parts of this codebase are based on Mega-NeRF, torch-ngp, neuralsim, Panoptic-Lifting, Contrastive-Lift, SAM, and Mask2Former. We thank the authors for releasing their code.


License

This project is released under the MIT License.

