Yukichiii / Swin3D_Task

The Experiment Code for Swin3D

Home Page: https://arxiv.org/abs/2304.06906


Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding


Updates

To be Done

  1. Release the full training scripts for CAGroup3D+Swin3D
  2. Upload the models and configs for FCAF3D+Swin3D
  3. Upload the models and configs for CAGroup3D+Swin3D

26/03/2024

Add object detection code:

  1. Update Object Detection code and configs with FCAF3D+Swin3D
  2. Update patch for CAGroup3D+Swin3D

27/04/2023

Initial commits:

  1. Code and models for semantic segmentation on ScanNet and S3DIS are provided.

Introduction

This repo contains the experiment code for Swin3D.

Overview

Environment

  1. Install dependencies:

       pip install -r requirements.txt
    
  2. Refer to the Swin3D repo to compile the Swin3D operators:

       git clone https://github.com/microsoft/Swin3D
       cd Swin3D
       python setup.py install
    

If you have problems installing the package, you can use the Docker image we provide:

  docker pull yukichiii/torch112_cu113:swin3d
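Whichever route you take, you can sanity-check that the compiled package is importable before training. A minimal sketch (the module name Swin3D is an assumption; adjust it if setup.py registers the package under a different name):

```python
import importlib.util

def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None

# "Swin3D" is an assumed module name -- check what setup.py actually installs.
if not is_installed("Swin3D"):
    print("Swin3D not found; re-run python setup.py install in the Swin3D repo.")
```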

To run the object detection code, please refer to FCAF3D (which is based on mmdetection3d) and CAGroup3D (which is based on OpenPCDet). Install the requirements for mmdetection3d, then run python setup.py install to install it.

Data Preparation

ScanNet Segmentation Data

Please refer to https://github.com/dvlab-research/PointGroup for the ScanNetv2 preprocessing. Then change the data_root entry in the yaml files in SemanticSeg/config/scannetv2.

S3DIS Segmentation Data

Please refer to https://github.com/yanx27/Pointnet_Pointnet2_pytorch for S3DIS preprocessing. Then modify the data_root entry in the yaml files in SemanticSeg/config/s3dis.

ScanNet 3D Detection Data

Please refer to https://github.com/SamsungLabs/fcaf3d for ScanNet preprocessing. Then modify the data_root entry in the config files in ObjectDet/FCAF3D/configs/scannet_det.

S3DIS 3D Detection Data

Please refer to https://github.com/SamsungLabs/fcaf3d for S3DIS preprocessing. Then modify the data_root entry in the config files in ObjectDet/FCAF3D/configs/s3dis_det.

Training

ScanNet Segmentation

Change the work directory to SemanticSeg

  cd SemanticSeg

To train a model on the ScanNet segmentation task with Swin3D-S or Swin3D-L from scratch:

  python train.py --config config/scannetv2/swin3D_RGBN_S.yaml
  or
  python train.py --config config/scannetv2/swin3D_RGBN_L.yaml

To finetune the model pretrained on Structured3D, you can download the pretrained model with cRSE(XYZ,RGB,Norm) here, and run:

  python train.py --config config/scannetv2/swin3D_RGBN_S.yaml args.weight PATH_TO_PRETRAINED_SWIN3D_RGBN_S
  or
  python train.py --config config/scannetv2/swin3D_RGBN_L.yaml args.weight PATH_TO_PRETRAINED_SWIN3D_RGBN_L
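The trailing args.weight PATH arguments above are KEY VALUE overrides applied on top of the YAML config. A minimal sketch of how such dotted-key overrides can be merged into a nested config dict (illustrative only, not the repo's actual parser):

```python
def apply_overrides(config: dict, pairs: list) -> dict:
    """Merge trailing `KEY VALUE` command-line pairs into a nested
    config dict, splitting dotted keys like "args.weight"."""
    if len(pairs) % 2 != 0:
        raise ValueError("overrides must come as KEY VALUE pairs")
    for key, value in zip(pairs[::2], pairs[1::2]):
        node = config
        *parents, leaf = key.split(".")
        for part in parents:
            node = node.setdefault(part, {})  # create nesting as needed
        node[leaf] = value
    return config

cfg = {"args": {"weight": None, "epochs": 100}}
apply_overrides(cfg, ["args.weight", "swin3d_rgbn_s.pth"])
# cfg["args"]["weight"] is now "swin3d_rgbn_s.pth"; other keys are untouched
```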

S3DIS Segmentation

Change the work directory to SemanticSeg

  cd SemanticSeg

To train a model on S3DIS Area 5 segmentation with Swin3D-S or Swin3D-L from scratch:

  python train.py --config config/s3dis/swin3D_RGB_S.yaml
  or
  python train.py --config config/s3dis/swin3D_RGB_L.yaml

To finetune the model pretrained on Structured3D, you can download the pretrained model with cRSE(XYZ,RGB) here, and run:

  python train.py --config config/s3dis/swin3D_RGB_S.yaml args.weight PATH_TO_PRETRAINED_SWIN3D_RGB_S
  or
  python train.py --config config/s3dis/swin3D_RGB_L.yaml args.weight PATH_TO_PRETRAINED_SWIN3D_RGB_L

3D Object Detection

To train from scratch with FCAF3D+Swin3D:

  python -m tools.train configs/scannet_det/Swin3D_S.py

To finetune the model pretrained on Structured3D, you can download the pretrained model with cRSE(XYZ,RGB), and run:

  python -m tools.train configs/scannet_det/Swin3D_S.py --load_weights PATH_TO_PRETRAINED_SWIN3D_RGB_S
  python -m tools.train configs/scannet_det/Swin3D_L.py --load_weights PATH_TO_PRETRAINED_SWIN3D_RGB_L

Evaluation

To run inference with a given checkpoint using TTA (test-time augmentation: the input scan is randomly rotated several times and the predictions are voted), you can download a model from the tables below and run:

ScanNet Segmentation

  python test.py --config config/scannetv2/swin3D_RGBN_S.yaml --vote_num 12 args.weight PATH_TO_CKPT
  or
  python test.py --config config/scannetv2/swin3D_RGBN_L.yaml --vote_num 12 args.weight PATH_TO_CKPT

S3DIS Area5 Segmentation

  python test.py --config config/s3dis/swin3D_RGB_S.yaml --vote_num 12 args.weight PATH_TO_CKPT
  or
  python test.py --config config/s3dis/swin3D_RGB_L.yaml --vote_num 12 args.weight PATH_TO_CKPT

For faster inference, you can change vote_num to 1.
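The TTA voting described above can be sketched as follows. This is a hedged illustration: model stands in for the repo's actual forward pass (which also handles voxelization and per-point features), and only a z-axis rotation is shown.

```python
import numpy as np

def rotate_z(points: np.ndarray, angle: float) -> np.ndarray:
    """Rotate an (N, 3) point cloud around the vertical (z) axis."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return points @ rot.T

def tta_predict(model, points: np.ndarray, num_classes: int,
                vote_num: int = 12) -> np.ndarray:
    """Accumulate per-point class scores over `vote_num` randomly
    rotated copies of the scan, then take the argmax vote."""
    votes = np.zeros((points.shape[0], num_classes))
    rng = np.random.default_rng(0)
    for _ in range(vote_num):
        angle = rng.uniform(0.0, 2.0 * np.pi)
        votes += model(rotate_z(points, angle))  # (N, num_classes) scores
    return votes.argmax(axis=1)
```

Setting vote_num to 1 skips the averaging and gives a single (faster) forward pass.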

3D Object Detection

For the detection task with FCAF3D+Swin3D:

  python -m tools.test configs/scannet_det/Swin3D_S.py CHECKPOINT_PATH --eval mAP --show-dir OUTPUT_PATH --out OUTPUT_PATH/result.pkl

Results and models

ScanNet Segmentation

  Pretrained  mIoU(Val)   mIoU(Test)  Model  Train  Eval
  Swin3D-S    75.2        -           model  log    log
  Swin3D-S    75.6(76.8)  -           model  log    log
  Swin3D-L    76.4(77.5)  77.9        model  log    log

S3DIS Segmentation

  Pretrained  Area 5 mIoU  6-fold mIoU  Model  Train  Eval
  Swin3D-S    72.5         76.9         model  log    log
  Swin3D-S    73.0         78.2         model  log    log
  Swin3D-L    74.5         79.8         model  log    log

ScanNet 3D Detection

  Pretrained          mAP@0.25  mAP@0.50  Model  Log
  Swin3D-S+FCAF3D     74.2      59.5      model  log
  Swin3D-L+FCAF3D     74.2      58.6      model  log
  Swin3D-S+CAGroup3D  76.4      62.7      model  log
  Swin3D-L+CAGroup3D  76.4      63.2      model  log

S3DIS 3D Detection

  Pretrained       mAP@0.25  mAP@0.50  Model  Log
  Swin3D-S+FCAF3D  69.9      50.2      model  log
  Swin3D-L+FCAF3D  72.1      54.0      model  log

Citation

If you find Swin3D useful to your research, please cite our work:

@misc{yang2023swin3d,
      title={Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding}, 
      author={Yu-Qi Yang and Yu-Xiao Guo and Jian-Yu Xiong and Yang Liu and Hao Pan and Peng-Shuai Wang and Xin Tong and Baining Guo},
      year={2023},
      eprint={2304.06906},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
