LGT-Net

This is PyTorch implementation of our paper "LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network"(CVPR'22). [Supplemental Materials] [Video] [Presentation] [Poster]

[Figure: network architecture]

Update

  • 2023.5.18 Updated post-processing. If you want to reproduce the post-processing results of the paper, please switch to the old commit. Check out Post-Porcessing.md for more information.

Demo

Installation

Install the dependencies:

pip install -r requirements.txt
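
If you prefer an isolated environment, create one first and install into it (standard Python tooling, nothing project-specific):

python -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate
pip install -r requirements.txt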

Preparing Dataset

MatterportLayout

The official MatterportLayout dataset is available here.

If you have problems using this dataset, please refer to this issue.

Make sure the dataset files are stored as follows:

src/dataset/mp3d
|-- image
|   |-- 17DRP5sb8fy_08115b08da534f1aafff2fa81fc73512.png
|-- label
|   |-- 17DRP5sb8fy_08115b08da534f1aafff2fa81fc73512.json
|-- split
    |-- test.txt
    |-- train.txt
    |-- val.txt
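
A minimal Python sketch to sanity-check this layout (our assumption, not guaranteed by the dataset: each line of split/*.txt is one sample stem matching the image and label file names):

# Hypothetical sanity check for the directory layout above.
# Assumption: each line of split/*.txt is a stem matching image/<stem>.png
# and label/<stem>.json -- verify against your copy of the dataset.
import os

root = "src/dataset/mp3d"
for split in ("train", "val", "test"):
    with open(os.path.join(root, "split", f"{split}.txt")) as f:
        stems = [line.strip() for line in f if line.strip()]
    missing = [s for s in stems
               if not os.path.isfile(os.path.join(root, "image", f"{s}.png"))
               or not os.path.isfile(os.path.join(root, "label", f"{s}.json"))]
    print(f"{split}: {len(stems)} samples, {len(missing)} with missing files")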


Statistics

| Split | All  | 4 Corners | 6 Corners | 8 Corners | >=10 Corners |
|:-----:|:----:|:---------:|:---------:|:---------:|:------------:|
| All   | 2295 | 1210      | 502       | 309       | 274          |
| Train | 1647 | 841       | 371       | 225       | 210          |
| Val   | 190  | 108       | 46        | 21        | 15           |
| Test  | 458  | 261       | 85        | 63        | 49           |

ZInD

The official ZInD dataset is available here.

Make sure the dataset files are stored as follows:

src/dataset/zind
|-- 0000
|   |-- panos
|   |   |-- floor_01_partial_room_01_pano_14.jpg
|   |-- zind_data.json
|-- room_shape_simplicity_labels.json
|-- zind_partition.json
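
To check the split sizes, you can inspect zind_partition.json; the sketch below assumes it maps split names to lists of scene IDs (verify against your download):

# Assumption: zind_partition.json maps split names (e.g., "train"/"val"/"test")
# to lists of scene IDs; verify against your downloaded file.
import json

with open("src/dataset/zind/zind_partition.json") as f:
    partition = json.load(f)
for split, scene_ids in partition.items():
    print(split, len(scene_ids))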

Statistics

| Split | All   | 4 Corners | 5 Corners | 6 Corners | 7 Corners | 8 Corners | 9 Corners | >=10 Corners | Manhattan | No-Manhattan (%) |
|:-----:|:-----:|:---------:|:---------:|:---------:|:---------:|:---------:|:---------:|:------------:|:---------:|:----------------:|
| All   | 31132 | 17293     | 1803      | 7307      | 774       | 2291      | 238       | 1426         | 26664     | 4468 (14.35%)    |
| Train | 24882 | 13866     | 1507      | 5745      | 641       | 1791      | 196       | 1136         | 21228     | 3654 (14.69%)    |
| Val   | 3080  | 1702      | 153       | 745       | 81        | 239       | 22        | 138          | 2647      | 433 (14.06%)     |
| Test  | 3170  | 1725      | 143       | 817       | 52        | 261       | 20        | 152          | 2789      | 381 (12.02%)     |

PanoContext and Stanford 2D-3D

We follow the same preprocessed pano/s2d3d dataset proposed by HorizonNet. You can also download the dataset files directly here.

Make sure the dataset files are stored as follows:

src/dataset/pano_s2d3d
|-- test
|   |-- img
|   |   |-- camera_0000896878bd47b2a624ad180aac062e_conferenceRoom_3_frame_equirectangular_domain_.png
|   |-- label_cor
|       |-- camera_0000896878bd47b2a624ad180aac062e_conferenceRoom_3_frame_equirectangular_domain_.txt
|-- train
|   |-- img
|   |-- label_cor
|-- valid
    |-- img
    |-- label_cor
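
A rough sketch for reading the label_cor files (our assumption: HorizonNet-style annotations with one "x y" pixel coordinate per line):

import numpy as np

def load_corners(txt_path):
    # Assumption: each non-empty line holds "x y" pixel coordinates of one
    # corner in the equirectangular image (HorizonNet-style annotation).
    with open(txt_path) as f:
        return np.array([[float(v) for v in line.split()]
                         for line in f if line.strip()])

corners = load_corners("src/dataset/pano_s2d3d/test/label_cor/"
                       "camera_0000896878bd47b2a624ad180aac062e_conferenceRoom_3"
                       "_frame_equirectangular_domain_.txt")
print(corners.shape)  # expected (N, 2)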
     

Downloading Pre-trained Weights

We provide pre-trained weights for the individual datasets here.

Make sure the pre-trained weight files are stored as follows:

checkpoints
|-- SWG_Transformer_LGT_Net
|   |-- ablation_study_full
|   |   |-- best.pkl
|   |-- mp3d
|   |   |-- best.pkl
|   |-- pano
|   |   |-- best.pkl
|   |-- s2d3d
|   |   |-- best.pkl
|   |-- zind
|       |-- best.pkl
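
To verify a downloaded weight file, here is a minimal sketch (assuming the .pkl files are ordinary PyTorch checkpoints; the dictionary layout is whatever the file actually contains):

import torch

# Assumption: best.pkl is a regular PyTorch checkpoint. Newer PyTorch
# releases default to weights_only=True and may need weights_only=False
# for pickled checkpoints.
ckpt = torch.load("checkpoints/SWG_Transformer_LGT_Net/mp3d/best.pkl",
                  map_location="cpu")
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))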

Evaluation

You can evaluate by executing the following commands (the flags are described below, with a combined example after the list):

  • MatterportLayout dataset
    python main.py --cfg src/config/mp3d.yaml --mode test --need_rmse
  • ZInD dataset
    python main.py --cfg src/config/zind.yaml --mode test --need_rmse
  • PanoContext dataset
    python main.py --cfg src/config/pano.yaml --mode test --need_cpe --post_processing manhattan --force_cube
  • Stanford 2D-3D dataset
    python main.py --cfg src/config/s2d3d.yaml --mode test --need_cpe --post_processing manhattan --force_cube
    • --post_processing type of post-processing approach. For the manhattan constraint, we use DuLa-Net's post-processing, improved with occlusion detection (described here); manhattan_old selects the original method. For the atalanta constraint, we use a DP algorithm. Disabled by default.
    • --need_rmse evaluate root mean squared error and delta error. Disabled by default.
    • --need_cpe evaluate corner error and pixel error. Disabled by default.
    • --need_f1 evaluate corner metrics (precision, recall, and F1-score) with a 10-pixel threshold (code from here). Disabled by default.
    • --force_cube force a cube shape when evaluating. Disabled by default.
    • --wall_num evaluate only layouts with the given corner number. Defaults to all.
    • --save_eval save the visualized evaluation results for each panorama; the outputs are written to the corresponding checkpoint directory (e.g., checkpoints/SWG_Transformer_LGT_Net/mp3d/results/test). Disabled by default.
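
For example, combining several of the flags above (an illustrative combination, not a setting from the paper):

python main.py --cfg src/config/mp3d.yaml --mode test --need_rmse --need_f1 --wall_num 4 --save_eval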

Training

Execute the following commands to train (e.g., MatterportLayout dataset):

python main.py --cfg src/config/mp3d.yaml --mode train

You can copy and modify the configuration YAML file for other training runs, for example:
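
To derive a new experiment from the MatterportLayout settings (my_experiment.yaml is a hypothetical name):

cp src/config/mp3d.yaml src/config/my_experiment.yaml
# edit the copied file, then:
python main.py --cfg src/config/my_experiment.yaml --mode train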

Inference

We provide an inference script (inference.py) that predicts layouts for your own panoramas. Execute the following command (e.g., using the pre-trained weights of the MatterportLayout dataset):

python inference.py --cfg src/config/mp3d.yaml --img_glob src/demo/demo1.png --output_dir src/output --post_processing manhattan

It will output JSON files (xxx_pred.json, in the same format as PanoAnnotator) and visualization images (xxx_pred.png) under output_dir.

  • --img_glob a panorama path or directory path for prediction (see the directory example after this list).

  • --post_processing If manhattan is selected, we preprocess the panorama so that the vanishing points are aligned with the axes before post-processing. Note that after this preprocessing, the predicted results will no longer align with your input panoramas; you can use the output file of vanishing points (vp.txt) to reverse the alignment manually.

  • --visualize_3d visualize the output results in 3D (requires installing extra dependencies and a GUI desktop environment).

  • --output_3d output an object file of the 3D mesh reconstruction.
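
For example, to predict every panorama in a directory and also export 3D meshes (the flag combination is illustrative):

python inference.py --cfg src/config/mp3d.yaml --img_glob src/demo --output_dir src/output --post_processing manhattan --output_3d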

Acknowledgements

The code structure is adapted from Swin-Transformer.

Some components refer to the following projects:

Citation

If you use this code for your research, please cite:

@InProceedings{jiang2022lgt,
    author    = {Jiang, Zhigang and Xiang, Zhongzheng and Xu, Jinhua and Zhao, Ming},
    title     = {LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2022}
}

License

MIT License