Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training

Official implementation of 'Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training'.

The paper has been accepted by NeurIPS 2022.

News

We have fixed some bugs in pre-training and released the fine-tuning code of Point-M2AE 📌.
Our latest work, I2P-MAE, has been accepted by CVPR 2023 🔥 and open-sourced. I2P-MAE leverages 2D pre-trained models to guide the pre-training of Point-M2AE and achieves SOTA performance on various 3D tasks.

Introduction

Comparison with existing MAE-based models for self-supervised 3D point cloud learning on the ModelNet40 dataset:

Method	Parameters	GFlops	Extra Data	Linear SVM	Fine-tuning	Voting
Point-BERT	22.1M	4.8	-	87.4%	92.7%	93.2%
ACT	22.1M	4.8	2D	-	-	93.7%
Point-MAE	22.1M	4.8	-	91.0%	93.2%	93.8%
Point-M2AE	12.9M	3.6	-	92.9%	93.4%	94.0%
I2P-MAE	12.9M	3.6	2D	93.4%	93.7%	94.1%

Point-M2AE is a strong Multi-scale MAE pre-training framework for hierarchical self-supervised learning of 3D point clouds. Unlike the standard transformer in MAE, we modify the encoder and decoder into pyramid architectures to progressively model spatial geometries and capture both fine-grained and high-level semantics of 3D shapes. We design a multi-scale masking strategy to generate consistent visible regions across scales, and reconstruct the masked coordinates from a global-to-local perspective.
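
The idea of keeping visible regions consistent across scales can be illustrated with a small pure-Python sketch. It masks tokens at random at the coarsest scale, then marks a finer-scale token as visible only if it belongs to a visible coarser token. The function names, the `groups` layout, the toy hierarchy, and the mask ratio are all hypothetical; how tokens are actually grouped across scales is not described in this README, and this is not the repository's implementation.

```python
import random

def mask_coarsest(num_tokens, mask_ratio, rng):
    """Return a list of booleans (True = visible) for the coarsest scale."""
    num_masked = int(num_tokens * mask_ratio)
    masked = set(rng.sample(range(num_tokens), num_masked))
    return [i not in masked for i in range(num_tokens)]

def back_project(visible_coarse, groups, num_fine):
    """Propagate visibility from one scale to the next finer scale.

    groups[i] lists the finer-scale token indices that were grouped into
    coarse token i (a hypothetical layout used only for this sketch).
    """
    visible_fine = [False] * num_fine
    for coarse_idx, is_visible in enumerate(visible_coarse):
        if is_visible:
            for fine_idx in groups[coarse_idx]:
                visible_fine[fine_idx] = True
    return visible_fine

rng = random.Random(0)

# Toy hierarchy: 8 fine tokens -> 4 mid tokens -> 2 coarse tokens.
groups_mid_to_fine = [[0, 1], [2, 3], [4, 5], [6, 7]]
groups_coarse_to_mid = [[0, 1], [2, 3]]

visible_coarse = mask_coarsest(2, 0.5, rng)
visible_mid = back_project(visible_coarse, groups_coarse_to_mid, 4)
visible_fine = back_project(visible_mid, groups_mid_to_fine, 8)
print(visible_coarse, visible_mid, visible_fine)
```

Because the mask is drawn once at the coarsest scale and only propagated downwards, a token is never visible at a fine scale while its parent region is masked at a coarser one.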

Point-M2AE Models

Pre-training

Pre-trained on ShapeNet, Point-M2AE is evaluated with a linear SVM on the ModelNet40 and ScanObjectNN (OBJ-BG split) datasets, without downstream fine-tuning:

Task	Dataset	Config	MN40 Acc.	OBJ-BG Acc.	Ckpts	Logs
Pre-training	ShapeNet	point-m2ae.yaml	92.87%	84.12%	pre-train.pth	log

Fine-tuning

Synthetic shape classification on ModelNet40 with 1k points:

Task	Config	Acc.	Vote	Ckpts	Logs
Classification	modelnet40.yaml	93.43%	93.96%	modelnet40.pth	modelnet40.log

Real-world shape classification on ScanObjectNN:

Task	Split	Config	Acc.	Ckpts	Logs
Classification	PB-T50-RS	scan_pb.yaml	86.43%	scan_pd.pth	scan_pd.log
Classification	OBJ-BG	scan_obj-bg.yaml	91.22%	scan_obj-bg.pth	scan_obj-pd.log
Classification	OBJ-ONLY	scan_obj.yaml	88.81%	scan_obj.pth	scan_obj.log

Part segmentation on ShapeNetPart:

Task	Dataset	Config	mIoUc	mIoUi	Ckpts	Logs
Segmentation	ShapeNetPart	segmentation	84.86%	86.51%	seg.pth	seg.log
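
The two segmentation metrics differ only in how per-shape IoU values are averaged: mIoUi averages over all shapes (instances), while mIoUc first averages within each category and then across categories. The sketch below assumes per-shape IoUs have already been computed; the function names and the toy values are illustrative and are not results from this repository.

```python
from collections import defaultdict

def mean_iou_instance(records):
    """Average IoU over all shapes, regardless of category (mIoUi)."""
    return sum(iou for _, iou in records) / len(records)

def mean_iou_class(records):
    """Average the per-category mean IoUs (mIoUc)."""
    per_category = defaultdict(list)
    for category, iou in records:
        per_category[category].append(iou)
    category_means = [sum(v) / len(v) for v in per_category.values()]
    return sum(category_means) / len(category_means)

# Toy (category, per-shape IoU) pairs, made up for illustration only.
records = [("chair", 0.9), ("chair", 0.8), ("chair", 0.7), ("mug", 0.5)]
print(mean_iou_instance(records), mean_iou_class(records))
```

In this toy case the frequent category dominates the instance average, so the instance-level value is higher than the class-level one.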

Few-shot classification on ModelNet40:

Task	Dataset	Config	5w10s	5w20s	10w10s	10w20s
Few-shot Cls.	ModelNet40	-	96.8%	98.3%	92.3%	95.0%
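
The column names follow the usual N-way K-shot shorthand, so "5w10s" means 5 classes with 10 labeled support samples each. A minimal sketch of sampling one such episode is shown below. The function name, the number of query samples per class, and the toy label list are assumptions for illustration; the exact protocol used for the numbers above is not stated in this README.

```python
import random

def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode as lists of sample indices."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    # Only classes with enough samples for support + query are eligible.
    eligible = sorted(c for c, idxs in by_class.items()
                      if len(idxs) >= k_shot + n_query)
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], k_shot + n_query)
        support.extend(picked[:k_shot])
        query.extend(picked[k_shot:])
    return classes, support, query

rng = random.Random(0)
labels = [i % 40 for i in range(40 * 30)]  # toy data: 40 classes, 30 samples each
classes, support, query = sample_episode(labels, n_way=5, k_shot=10,
                                         n_query=20, rng=rng)
print(len(classes), len(support), len(query))
```

The support indices would be used for training within the episode and the query indices for measuring accuracy.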

Requirements

Installation

Create a conda environment and install basic dependencies:

git clone https://github.com/ZrrSkywalker/Point-M2AE.git
cd Point-M2AE

conda create -n pointm2ae python=3.8
conda activate pointm2ae

# Install the versions of torch and torchvision that match your CUDA setup
conda install pytorch torchvision cudatoolkit
# e.g., conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3

pip install -r requirements.txt

Install GPU-related packages:

# Chamfer Distance and EMD
cd ./extensions/chamfer_dist
python setup.py install --user
cd ../emd
python setup.py install --user

# PointNet++
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"

# GPU kNN
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl
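
After installation, a quick check that the packages can be located may save a failed training run. The sketch below only looks modules up without importing them. The helper name is hypothetical, and the import names `chamfer` and `emd` in particular are assumptions about what the two extensions expose; adjust the list to whatever your build actually installs.

```python
import importlib.util

def find_missing(module_names):
    """Return the top-level module names that cannot be located."""
    missing = []
    for name in module_names:
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing

# Assumed import names for the packages installed above.
EXPECTED = ["torch", "torchvision", "chamfer", "emd", "pointnet2_ops", "knn_cuda"]

if __name__ == "__main__":
    missing = find_missing(EXPECTED)
    print("missing:", missing if missing else "none")
```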

Datasets

For pre-training and fine-tuning, please follow DATASET.md to install the ShapeNet, ModelNet40, ScanObjectNN, and ShapeNetPart datasets, referring to Point-BERT. For Linear SVM evaluation specifically, download the official ModelNet40 dataset and put the unzipped folder under data/.

The final directory structure should be:

│Point-M2AE/
├──cfgs/
├──datasets/
├──data/
│   ├──ModelNet/
│   ├──ModelNetFewshot/
│   ├──modelnet40_ply_hdf5_2048/  # Only needed for Linear SVM
│   ├──ScanObjectNN/
│   ├──ShapeNet55-34/
│   ├──shapenetcore_partanno_segmentation_benchmark_v0_normal/
├──...
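
The layout above can be verified with a short script before launching any job. This is a minimal sketch that only checks that the listed folders exist under data/; the helper name is hypothetical and the contents of each folder are not inspected.

```python
from pathlib import Path

# Folder names taken from the directory structure above.
EXPECTED_DIRS = [
    "ModelNet",
    "ModelNetFewshot",
    "modelnet40_ply_hdf5_2048",  # only needed for Linear SVM
    "ScanObjectNN",
    "ShapeNet55-34",
    "shapenetcore_partanno_segmentation_benchmark_v0_normal",
]

def missing_dataset_dirs(data_root):
    """Return the expected dataset folders that are absent under data_root."""
    root = Path(data_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]

if __name__ == "__main__":
    # Run from the Point-M2AE/ repository root.
    print("missing:", missing_dataset_dirs("data"))
```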

Get Started

Pre-training

Point-M2AE is pre-trained on the ShapeNet dataset with the config file cfgs/pre-training/point-m2ae.yaml. Run:

CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/pre-training/point-m2ae.yaml --exp_name pre-train

To evaluate the pre-trained Point-M2AE with a linear SVM, create a folder ckpts/ and download pre-train.pth into it. Use the configs in cfgs/linear-svm/ and specify the evaluation dataset with --test_svm.

For ModelNet40, run:

CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/linear-svm/modelnet40.yaml --test_svm modelnet40 --exp_name test_svm --ckpts ./ckpts/pre-train.pth

For ScanObjectNN (OBJ-BG split), run:

CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/linear-svm/scan_obj-bg.yaml --test_svm scan --exp_name test_svm --ckpts ./ckpts/pre-train.pth

Fine-tuning

Please create a folder ckpts/ and download the pre-train.pth into it. The fine-tuning configs are in cfgs/fine-tuning/.

For ModelNet40, run:

CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/modelnet40.yaml --finetune_model --exp_name finetune --ckpts ckpts/pre-train.pth

For the three splits of ScanObjectNN, run:

CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/scan_pb.yaml --finetune_model --exp_name finetune --ckpts ckpts/pre-train.pth

CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/scan_obj.yaml --finetune_model --exp_name finetune --ckpts ckpts/pre-train.pth

CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/scan_obj-bg.yaml --finetune_model --exp_name finetune --ckpts ckpts/pre-train.pth
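
The three ScanObjectNN commands differ only in the config file, so they can also be driven from a small wrapper. The sketch below assembles the same arguments as the commands above; the helper names and the `dry_run` switch are hypothetical conveniences, not part of the repository.

```python
import os
import subprocess

# Config files for the three ScanObjectNN splits, as used above.
SPLIT_CONFIGS = ["scan_pb.yaml", "scan_obj.yaml", "scan_obj-bg.yaml"]

def build_finetune_cmd(config_name, ckpt="ckpts/pre-train.pth", exp_name="finetune"):
    """Assemble the fine-tuning command as an argument list."""
    return [
        "python", "main.py",
        "--config", f"cfgs/fine-tuning/{config_name}",
        "--finetune_model",
        "--exp_name", exp_name,
        "--ckpts", ckpt,
    ]

def run_all(gpu="0", dry_run=True):
    """Print each command, and execute it unless dry_run is set."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    for config_name in SPLIT_CONFIGS:
        cmd = build_finetune_cmd(config_name)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, env=env, check=True)

if __name__ == "__main__":
    run_all(dry_run=True)
```

With `dry_run=True` the script only prints the commands, which is a cheap way to confirm the paths before committing GPU time.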

For ShapeNetPart, first enter the segmentation/ folder, then run:

cd segmentation
CUDA_VISIBLE_DEVICES=0 python main.py --model Point_M2AE_SEG --log_dir finetune --ckpts ./ckpts/pre-train.pth

Evaluation

Please download the pre-trained models from here and put them into the folder ckpts/.

For ModelNet40 without voting, run:

CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/modelnet40.yaml --test --exp_name finetune --ckpts ckpts/modelnet.pth

For ModelNet40 with voting, run:

CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/modelnet40.yaml --test --vote --exp_name finetune_vote --ckpts ckpts/modelnet.pth
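
Voting at test time generally means averaging the classifier's scores over several augmented copies of the same shape and predicting from the average. The sketch below shows that general idea in pure Python. The random scaling range, the number of votes, and the stand-in classifier `toy_score_fn` are all assumptions; the augmentation actually applied by `--vote` is not described in this README.

```python
import random

def vote_predict(score_fn, points, num_votes, rng):
    """Average class scores over several randomly scaled copies of a shape."""
    total = None
    for _ in range(num_votes):
        scale = rng.uniform(0.8, 1.2)  # hypothetical augmentation range
        augmented = [(x * scale, y * scale, z * scale) for x, y, z in points]
        scores = score_fn(augmented)
        total = scores if total is None else [t + s for t, s in zip(total, scores)]
    mean_scores = [t / num_votes for t in total]
    predicted = max(range(len(mean_scores)), key=mean_scores.__getitem__)
    return predicted, mean_scores

def toy_score_fn(points):
    """Stand-in for a trained classifier: two class scores from the extent."""
    extent = max(abs(c) for p in points for c in p)
    return [extent, 1.0]

rng = random.Random(0)
points = [(1.5, 0.0, 0.0), (0.0, 0.5, 0.0)]
predicted, mean_scores = vote_predict(toy_score_fn, points, num_votes=10, rng=rng)
print(predicted, mean_scores)
```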

For the three splits of ScanObjectNN, run:

CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/scan_pb.yaml --test --exp_name finetune --ckpts ckpts/scan_pb.pth

CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/scan_obj.yaml --test --exp_name finetune --ckpts ckpts/scan_obj.pth

CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/scan_obj-bg.yaml --test --exp_name finetune --ckpts ckpts/scan_obj-bg.pth

Acknowledgement

This repo benefits from Point-BERT and Point-MAE. We thank the authors for their wonderful work.

Citation

@article{zhang2022point,
  title={Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training},
  author={Zhang, Renrui and Guo, Ziyu and Gao, Peng and Fang, Rongyao and Zhao, Bin and Wang, Dong and Qiao, Yu and Li, Hongsheng},
  journal={arXiv preprint arXiv:2205.14401},
  year={2022}
}

Contact

If you have any questions about this project, please feel free to contact zhangrenrui@pjlab.org.cn.
