PointMamba

A Simple State Space Model for Point Cloud Analysis

Dingkang Liang¹ *, Xin Zhou¹ *, Xinyu Wang¹ *, Xingkui Zhu¹, Wei Xu¹, Zhikang Zou², Xiaoqing Ye², and Xiang Bai¹

¹ Huazhong University of Science & Technology, ² Baidu Inc.

(*) equal contribution

Abstract

Transformers have become one of the foundational architectures for point cloud analysis thanks to their strong global modeling ability. However, the attention mechanism has quadratic complexity in sequence length, which makes it hard to extend to long sequences under limited computational resources. Recently, state space models (SSMs), a new family of deep sequence models, have shown great potential for sequence modeling in NLP tasks. In this paper, taking inspiration from the success of SSMs in NLP, we propose PointMamba, a framework with global modeling and linear complexity. Specifically, taking embedded point patches as input, we propose a reordering strategy that enhances the SSM's global modeling ability by providing a more logical geometric scanning order. The reordered point tokens are then sent to a series of Mamba blocks to causally capture the point cloud structure. Experimental results show that PointMamba outperforms its transformer-based counterparts on different point cloud analysis datasets while using about 44.3% fewer parameters and 25% fewer FLOPs, demonstrating its potential as an option for constructing foundational 3D vision models. We hope PointMamba can provide a new perspective for point cloud analysis.
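To make the reordering idea more concrete, here is a minimal sketch in plain Python. It assumes that "geometric scanning order" means sorting patch tokens by the coordinates of their patch centers along each axis and concatenating the resulting sequences; the abstract does not spell out the exact ordering, so this is an illustration rather than the released implementation. The function name `reorder_by_axes` and its arguments are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def reorder_by_axes(
    centers: Sequence[Point],
    tokens: Sequence[object],
    axes: Sequence[int] = (0, 1, 2),
) -> List[object]:
    """Concatenate the tokens sorted in ascending order along each axis.

    Assumption (not confirmed by the README): one token per patch center,
    and one scan per coordinate axis (0 = x, 1 = y, 2 = z).
    """
    if len(centers) != len(tokens):
        raise ValueError("centers and tokens must have the same length")

    reordered: List[object] = []
    for axis in axes:
        # Indices of the patches sorted by their center coordinate on this axis.
        order = sorted(range(len(centers)), key=lambda i: centers[i][axis])
        reordered.extend(tokens[i] for i in order)
    return reordered


if __name__ == "__main__":
    centers = [(0.3, 0.1, 0.9), (0.1, 0.5, 0.2), (0.2, 0.9, 0.4)]
    tokens = ["a", "b", "c"]
    print(reorder_by_axes(centers, tokens))
```

Under this assumption the output sequence is three times as long as the input, which a model with linear complexity in sequence length can absorb more cheaply than quadratic attention.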

Main Results

Task	Dataset	Config	Acc. (scratch)	Download (scratch)	Acc. (pre-train)	Download (fine-tune)
Pre-training	ShapeNet	pretrain.yaml			N.A.	here
Classification	ScanObjectNN	finetune_scan_objbg.yaml	88.30%	here	90.71%	here
Classification	ScanObjectNN	finetune_scan_objonly.yaml	87.78%	here	88.47%	here
Classification	ScanObjectNN	finetune_scan_hardest.yaml	82.48%	here	84.87%	here
Part Segmentation	ShapeNetPart	part segmentation	85.8% mIoU	here	86.0% mIoU	here

Getting Started

Environment

This codebase was tested with the following environment configurations. It may work with other versions.

Ubuntu 20.04
CUDA 11.7
Python 3.9
PyTorch 1.13.1 + cu117

Installation

We recommend using Anaconda for the installation process:

# Create virtual env and install PyTorch
$ conda create -n pointmamba python=3.9
$ conda activate pointmamba
(pointmamba) $ pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117

# Install basic required packages
(pointmamba) $ pip install -r requirements.txt

# Chamfer Distance & EMD (each command returns to the repository root afterwards)
(pointmamba) $ cd ./extensions/chamfer_dist && python setup.py install --user && cd ../..
(pointmamba) $ cd ./extensions/emd && python setup.py install --user && cd ../..

# PointNet++
(pointmamba) $ pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"

# GPU kNN
(pointmamba) $ pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl

# Mamba
(pointmamba) $ pip install "causal-conv1d>=1.1.0"
(pointmamba) $ pip install mamba-ssm
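
After the steps above, a quick check that the main dependencies are importable can save a failed training launch. The sketch below uses only the standard library. The import names (`torch`, `pointnet2_ops`, `knn_cuda`, `causal_conv1d`, `mamba_ssm`) are assumptions based on how these packages are usually imported; the Chamfer Distance and EMD extensions are left out because their module names are not given here.

```python
import importlib.util
from typing import List, Sequence

# Assumed import names for the packages installed above (not stated in this README).
REQUIRED_MODULES = ("torch", "pointnet2_ops", "knn_cuda", "causal_conv1d", "mamba_ssm")


def find_missing(modules: Sequence[str] = REQUIRED_MODULES) -> List[str]:
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    print("missing modules:", ", ".join(missing) if missing else "none")
```

This only checks that each module can be found, not that its CUDA kernels were built for the installed PyTorch and CUDA versions.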

Datasets

See DATASET.md for details.

Usage

Pre-train

CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/pretrain.yaml --exp_name <name>

Classification on ScanObjectNN

Training from scratch.

CUDA_VISIBLE_DEVICES=<GPU> python main.py --scratch_model --config cfgs/finetune_scan_objbg.yaml --exp_name <name>

Fine-tuning from a pre-trained checkpoint.

CUDA_VISIBLE_DEVICES=<GPU> python main.py --finetune_model --config cfgs/finetune_scan_objbg.yaml --ckpts <path/to/pre-trained/model> --exp_name <name>
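
To run all three ScanObjectNN variants from the results table, the two commands above can be generated programmatically. The sketch below only composes argument lists from the flags shown in this section (`--scratch_model`, `--finetune_model`, `--config`, `--ckpts`, `--exp_name`) and does not launch anything. The helper name `build_scanobjectnn_cmd`, the experiment names, and the assumption that the `objonly` and `hardest` configs also live under `cfgs/` are illustrative, not taken from the repository.

```python
from typing import List, Optional


def build_scanobjectnn_cmd(
    config: str, exp_name: str, ckpt: Optional[str] = None
) -> List[str]:
    """Compose the main.py arguments for one ScanObjectNN run.

    Without ckpt: training from scratch (--scratch_model).
    With ckpt:    fine-tuning (--finetune_model plus --ckpts <ckpt>).
    """
    cmd = ["python", "main.py"]
    cmd.append("--finetune_model" if ckpt else "--scratch_model")
    cmd += ["--config", config]
    if ckpt:
        cmd += ["--ckpts", ckpt]
    cmd += ["--exp_name", exp_name]
    return cmd


if __name__ == "__main__":
    # Assumed config locations: only the objbg path appears verbatim above.
    for variant in ("objbg", "objonly", "hardest"):
        config = f"cfgs/finetune_scan_{variant}.yaml"
        print(" ".join(build_scanobjectnn_cmd(config, f"scratch_{variant}")))
```

Each printed line can be prefixed with `CUDA_VISIBLE_DEVICES=<GPU>` as in the commands above, or the list can be passed to `subprocess.run` with that variable set in the environment.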

Part Segmentation on ShapeNetPart

Training from scratch.

cd part_segmentation
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/config.yaml --log_dir <name>

Fine-tuning from a pre-trained checkpoint.

cd part_segmentation
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/config.yaml --ckpts <path/to/pre-trained/model> --log_dir <name>

To Do

Release code.
Release checkpoints.
Semantic segmentation.

Acknowledgement

This project is based on Point-BERT (paper, code), Point-MAE (paper, code), Mamba (paper, code), and Causal-Conv1d (code). We thank the authors for their wonderful work.

Citation

If you find this repository useful in your research, please consider giving it a star ⭐ and citing our paper:

@article{liang2024pointmamba,
      title={PointMamba: A Simple State Space Model for Point Cloud Analysis}, 
      author={Dingkang Liang and Xin Zhou and Xinyu Wang and Xingkui Zhu and Wei Xu and Zhikang Zou and Xiaoqing Ye and Xiang Bai},
      journal={arXiv preprint arXiv:2402.10739},
      year={2024}
}
