lx6c78 / Vision-Mamba-A-Comprehensive-Survey-and-Taxonomy

Vision Mamba: A Comprehensive Survey and Taxonomy


Abstract: The state space model (SSM) is a mathematical model used to describe and analyze the behavior of dynamic systems. It has found numerous applications in fields including control theory, signal processing, economics and machine learning. In deep learning, state space models are used to process sequence data in tasks such as time series analysis, natural language processing (NLP) and video understanding. By mapping sequence data to a state space, long-term dependencies in the data can be better captured. In particular, modern SSMs have shown strong representational capabilities in NLP, especially in long sequence modeling, while maintaining linear time complexity. Notably, building on the latest state space models, Mamba \cite{Mamba} merges time-varying parameters into SSMs and formulates a hardware-aware algorithm for efficient training and inference. Given its impressive efficiency and strong long-range dependency modeling capability, Mamba is expected to become a new AI architecture that may outperform the Transformer. Recently, a number of works have studied the potential of Mamba in various fields, such as general vision, multi-modal learning, medical image analysis and remote sensing image analysis, by extending Mamba from the natural language domain to the visual domain. To fully understand Mamba in the visual domain, we conduct a comprehensive survey and present a taxonomy study. This survey focuses on Mamba's application to a variety of visual tasks and data types, and discusses its predecessors, recent advances and far-reaching impact on a wide range of domains. Since Mamba is on an upward trend, please let us know if you have new findings; new progress on Mamba will be included in this survey in a timely manner and updated on the website: (https://github.com/lx6c78/Vision-Mamba-A-Comprehensive-Survey-and-Taxonomy).
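To make the idea of "time-varying parameters" with linear time complexity concrete, here is a minimal sequential sketch of an input-dependent (selective) diagonal SSM. It is an illustration under stated assumptions, not the Mamba implementation: the function name `selective_scan`, the weights `w_b`, `w_c`, `w_delta`, `b_delta`, and the choice of making the step size and the B/C terms simple linear functions of the input are all hypothetical, and the hardware-aware parallel scan mentioned in the abstract is not reproduced here.

```python
import math


def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps the step size positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def selective_scan(xs, a_diag, w_b, w_c, w_delta, b_delta=0.0):
    """Run a diagonal, input-dependent SSM over a 1-D sequence.

    Per step: h = exp(delta * a) * h + (delta * b) * x,  y = sum(c * h),
    where delta, b and c are recomputed from the current input x.
    All weights are hypothetical illustration parameters.
    """
    h = [0.0] * len(a_diag)  # hidden state, one entry per diagonal element of A
    ys = []
    for x in xs:  # a single pass, so the cost is linear in the sequence length
        delta = softplus(w_delta * x + b_delta)  # input-dependent step size
        new_h = []
        for h_i, a_i, wb_i in zip(h, a_diag, w_b):
            a_bar = math.exp(delta * a_i)  # zero-order-hold discretization of A
            b_bar = delta * (wb_i * x)     # simplified (Euler) discretization of B
            new_h.append(a_bar * h_i + b_bar * x)
        h = new_h
        # Input-dependent readout: c_i = w_c[i] * x
        ys.append(sum((wc_i * x) * h_i for wc_i, h_i in zip(w_c, h)))
    return ys
```

Because the state is updated once per element, the output at each position depends only on the inputs seen so far, and the whole sequence is processed in one pass; for example, `selective_scan([1.0, 2.0, 3.0], a_diag=[-1.0, -0.5], w_b=[1.0, 0.5], w_c=[1.0, 1.0], w_delta=0.1)` returns one output per input element.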

We will keep this page updated with the latest representative literature and released source code. If you have any questions, please don't hesitate to contact us at any of the following emails: liuxiao@stu.cqu.edu.cn, zhangchenxu@cqu.edu.cn, leizhang@cqu.edu.cn

📢 Update Log

  • 2024.05.07: Our paper is released! [arXiv]
  • 2024.05.18: Added "Latest Visual Mamba Papers" column. We plan to update these papers in subsequent versions of our survey.

Citation

If you find this repository useful, please cite our paper:

@misc{liu2024vision,
      title={Vision Mamba: A Comprehensive Survey and Taxonomy}, 
      author={Xiao Liu and Chenxu Zhang and Lei Zhang},
      year={2024},
      eprint={2405.04404},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Contents

Related Survey

  • State Space Model for New-Generation Network Alternative to Transformers: A Survey. [15 April, 2024] [ArXiv, 2024]
    Xiao Wang, Shiao Wang, Yuhe Ding, Yuehang Li, Wentao Wu, Yao Rong, Weizhe Kong, Ju Huang, Shihao Li, Haoxiang Yang, Ziwen Wang, Bowei Jiang, Chenglong Li, Yaowei Wang, Yonghong Tian, Jin Tang.
    [Paper] [Github]
  • A Survey on Visual Mamba. [26 April, 2024] [ArXiv, 2024]
    Hanwei Zhang, Ying Zhu, Dan Wang, Lijun Zhang, Tianxiang Chen, Zi Ye.
    [Paper]
  • Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges. [24 April, 2024] [ArXiv, 2024]
    Badri Narayana Patro, Vijay Srinivas Agneeswaran.
    [Paper] [Github]
  • A Survey on Vision Mamba: Models, Applications and Challenges. [29 April, 2024] [ArXiv, 2024]
    Rui Xu, Shu Yang, Yihui Wang, Bo Du, Hao Chen.
    [Paper] [Github]
  • Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis. [5 June, 2024] [ArXiv, 2024]
    Moein Heidari, Sina Ghorbani Kolahi, Sanaz Karimijafarbigloo, Bobby Azad, Afshin Bozorgpour, Soheila Hatami, Reza Azad, Ali Diba, Ulas Bagci, Dorit Merhof, Ilker Hacihaliloglu.
    [Paper] [Github]

Latest Visual Mamba Papers

We plan to update these papers in subsequent versions of our survey.

  • MambaST: A Plug-and-Play Cross-Spectral Spatial-Temporal Fuser for Efficient Pedestrian Detection. [2 August, 2024] [ArXiv, 2024]
    Xiangbo Gao, Asiegbu Miracle Kanu-Asiegbu, Xiaoxiao Du.
    [Paper] [Code]
  • Multi-head Spatial-Spectral Mamba for Hyperspectral Image Classification. [26 August, 2024] [ArXiv, 2024]
    Muhammad Ahmad, Muhammad Hassaan Farooq Butt, Muhammad Usama, Hamad Ahmed Altuwaijri, Manuel Mazzara, Salvatore Distefano.
    [Paper]
  • WaveMamba: Spatial-Spectral Wavelet Mamba for Hyperspectral Image Classification. [2 August, 2024] [ArXiv, 2024]
    Muhammad Ahmad, Muhammad Usama, Manuel Mazzara.
    [Paper]
  • Wave-Mamba: Wavelet State Space Model for Ultra-High-Definition Low-Light Image Enhancement. [2 August, 2024] [ArXiv, 2024]
    Wenbin Zou, Hongxia Gao, Weipeng Yang, Tongtong Liu.
    [Paper] [Code]
  • Spatial-Spectral Morphological Mamba for Hyperspectral Image Classification. [23 August, 2024] [ArXiv, 2024]
    Muhammad Ahmad, Muhammad Hassaan Farooq Butt, Muhammad Usama, Adil Mehmood Khan, Manuel Mazzara, Salvatore Distefano, Hamad Ahmed Altuwaijri, Swalpa Kumar Roy, Jocelyn Chanussot, Danfeng Hong.
    [Paper]
  • JambaTalk: Speech-Driven 3D Talking Head Generation Based on Hybrid Transformer-Mamba Model. [2 August, 2024] [ArXiv, 2024]
    Farzaneh Jafari, Stefano Berretti, Anup Basu.
    [Paper]
  • DeMansia: Mamba Never Forgets Any Tokens. [4 August, 2024] [ArXiv, 2024]
    Ricky Fang.
    [Paper] [Code]
  • BioMamba: A Pre-trained Biomedical Language Representation Model Leveraging Mamba. [5 August, 2024] [ArXiv, 2024]
    Ling Yue, Sixue Xing, Yingzhou Lu, Tianfan Fu.
    [Paper]
  • LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba. [5 August, 2024] [ArXiv, 2024]
    Yunxiang Fu, Chaoqi Chen, Yizhou Yu.
    [Paper]
  • Context-aware Mamba-based Reinforcement Learning for social robot navigation. [5 August, 2024] [ArXiv, 2024]
    Syed Muhammad Mustafa, Omema Rizvi, Zain Ahmed Usmani, Abdul Basit Memon.
    [Paper]
  • Pose Magic: Efficient and Temporally Consistent Human Pose Estimation with a Hybrid Mamba-GCN Network. [7 August, 2024] [ArXiv, 2024]
    Xinyi Zhang, Qiqi Bao, Qinpeng Cui, Wenming Yang, Qingmin Liao.
    [Paper]
  • PoseMamba: Monocular 3D Human Pose Estimation with Bidirectional Global-Local Spatio-Temporal State Space Model. [7 August, 2024] [ArXiv, 2024]
    Yunlong Huang, Junshuo Liu, Ke Xian, Robert Caiming Qiu.
    [Paper]
  • Neural Architecture Search based Global-local Vision Mamba for Palm-Vein Recognition. [13 August, 2024] [ArXiv, 2024]
    Huafeng Qin, Yuming Fu, Jing Chen, Mounim A. El-Yacoubi, Xinbo Gao, Jun Wang.
    [Paper]
  • Costal Cartilage Segmentation with Topology Guided Deformable Mamba: Method and Benchmark. [14 August, 2024] [ArXiv, 2024]
    Senmao Wang, Haifan Gong, Runmeng Cui, Boyao Wan, Yicheng Liu, Zhonglin Hu, Haiqing Yang, Jingyang Zhou, Bo Pan, Lin Lin, Haiyue Jiang.
    [Paper]
  • MambaVT: Spatio-Temporal Contextual Modeling for robust RGB-T Tracking. [14 August, 2024] [ArXiv, 2024]
    Simiao Lai, Chang Liu, Jiawen Zhu, Ben Kang, Yang Liu, Dong Wang, Huchuan Lu.
    [Paper]
  • MambaMIM: Pre-training Mamba with State Space Token-interpolation. [15 August, 2024] [ArXiv, 2024]
    Fenghe Tang, Bingkun Nian, Yingtai Li, Jie Yang, Liu Wei, S. Kevin Zhou.
    [Paper] [Code]
  • ColorMamba: Towards High-quality NIR-to-RGB Spectral Translation with Mamba. [15 August, 2024] [ArXiv, 2024]
    Huiyu Zhai, Guang Jin, Xingxing Yang, Guosheng Kang.
    [Paper] [Code]
  • QMambaBSR: Burst Image Super-Resolution with Query State Space Model. [16 August, 2024] [ArXiv, 2024]
    Xin Di, Long Peng, Peizhe Xia, Wenbo Li, Renjing Pei, Yang Cao, Yang Wang, Zheng-Jun Zha.
    [Paper]
  • RGBT Tracking via All-layer Multimodal Interactions with Progressive Fusion Mamba. [16 August, 2024] [ArXiv, 2024]
    Andong Lu, Wanyu Wang, Chenglong Li, Jin Tang, Bin Luo.
    [Paper]
  • MambaTrack: A Simple Baseline for Multiple Object Tracking with State Space Model. [17 August, 2024] [ArXiv, 2024]
    Changcheng Xiao, Qiong Cao, Zhigang Luo, Long Lan.
    [Paper]
  • R2GenCSR: Retrieving Context Samples for Large Language Model based X-ray Medical Report Generation. [19 August, 2024] [ArXiv, 2024]
    Xiao Wang, Yuehang Li, Fuling Wang, Shiao Wang, Chuanfu Li, Bo Jiang.
    [Paper] [Code]
  • Event Stream based Human Action Recognition: A High-Definition Benchmark Dataset and Algorithms. [19 August, 2024] [ArXiv, 2024]
    Xiao Wang, Shiao Wang, Pengpeng Shao, Bo Jiang, Lin Zhu, Yonghong Tian.
    [Paper] [Code]
  • OccMamba: Semantic Occupancy Prediction with State Space Models. [19 August, 2024] [ArXiv, 2024]
    Heng Li, Yuenan Hou, Xiaohan Xing, Xiao Sun, Yanyong Zhang.
    [Paper]
  • Multi-Scale Representation Learning for Image Restoration with State-Space Model. [19 August, 2024] [ArXiv, 2024]
    Yuhong He, Long Peng, Qiaosi Yi, Chen Wu, Lu Wang.
    [Paper]
  • MambaEVT: Event Stream based Visual Object Tracking using State Space Model. [19 August, 2024] [ArXiv, 2024]
    Xiao Wang, Chao Wang, Shiao Wang, Xixi Wang, Zhicheng Zhao, Lin Zhu, Bo Jiang.
    [Paper] [Code]
  • Event Stream based Sign Language Translation: A High-Definition Benchmark Dataset and A New Algorithm. [19 August, 2024] [ArXiv, 2024]
    Xiao Wang, Yao Rong, Fuling Wang, Jianing Li, Lin Zhu, Bo Jiang, Yaowei Wang.
    [Paper] [Code]
  • MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval. [20 August, 2024] [ArXiv, 2024]
    Haoran Tang, Meng Cao, Jinfa Huang, Ruyang Liu, Peng Jin, Ge Li, Xiaodan Liang.
    [Paper] [Code]
  • MV-MOS: Multi-View Feature Fusion for 3D Moving Object Segmentation. [20 August, 2024] [ArXiv, 2024]
    Jintao Cheng, Xingming Chen, Jinxin Liang, Xiaoyu Tang, Xieyuanli Chen, Dachuan Li.
    [Paper] [Code]
  • OMEGA: Efficient Occlusion-Aware Navigation for Air-Ground Robot in Dynamic Environments via State Space Model. [20 August, 2024] [ArXiv, 2024]
    Junming Wang, Dong Huang, Xiuxian Guan, Zekai Sun, Tianxiang Shen, Fangming Liu, Heming Cui.
    [Paper] [Homepage] [Code]
  • DemMamba: Alignment-free Raw Video Demoireing with Frequency-assisted Spatio-Temporal Mamba. [20 August, 2024] [ArXiv, 2024]
    Shuning Xu, Xina Liu, Binbin Song, Xiangyu Chen, Qiubo Chen, Jiantao Zhou.
    [Paper]
  • MambaDS: Near-Surface Meteorological Field Downscaling with Topography Constrained Selective State Space Modeling. [20 August, 2024] [ArXiv, 2024]
    Zili Liu, Hao Chen, Lei Bai, Wenyuan Li, Wanli Ouyang, Zhengxia Zou, Zhenwei Shi.
    [Paper]
  • HMT-UNet: A hybird Mamba-Transformer Vision UNet for Medical Image Segmentation. [20 August, 2024] [ArXiv, 2024]
    Mingya Zhang, Limei Gu, Tingshen Ling, Xianping Tao.
    [Paper] [Code]
  • MambaOcc: Visual State Space Model for BEV-based Occupancy Prediction with Local Adaptive Reordering. [21 August, 2024] [ArXiv, 2024]
    Yonglin Tian, Songlin Bai, Zhiyao Luo, Yutong Wang, Yisheng Lv, Fei-Yue Wang.
    [Paper] [Code]
  • UNetMamba: An Efficient UNet-Like Mamba for Semantic Segmentation of High-Resolution Remote Sensing Images. [26 August, 2024] [ArXiv, 2024]
    Enze Zhu, Zhan Chen, Dingkai Wang, Hanru Shi, Xiaoxuan Liu, Lei Wang.
    [Paper] [Code]
  • MambaCSR: Dual-Interleaved Scanning for Compressed Image Super-Resolution With SSMs. [21 August, 2024] [ArXiv, 2024]
    Yulin Ren, Xin Li, Mengxi Guo, Bingchen Li, Shijie Zhao, Zhibo Chen.
    [Paper] [Code]
  • Scalable Autoregressive Image Generation with Mamba. [22 August, 2024] [ArXiv, 2024]
    Haopeng Li, Jinyue Yang, Kexin Wang, Xuerui Qiu, Yuhong Chou, Xin Li, Guoqi Li.
    [Paper] [Code]
  • Adapt CLIP as Aggregation Instructor for Image Dehazing. [22 August, 2024] [ArXiv, 2024]
    Xiaozhe Zhang, Fengying Xie, Haidong Ding, Linpeng Pan, Zhenwei Shi.
    [Paper]
  • O-Mamba: O-shape State-Space Model for Underwater Image Enhancement. [22 August, 2024] [ArXiv, 2024]
    Chenyu Dong, Chen Zhao, Weiling Cai, Bo Yang.
    [Paper] [Code]
  • MSVM-UNet: Multi-Scale Vision Mamba UNet for Medical Image Segmentation. [25 August, 2024] [ArXiv, 2024]
    Chaowei Chen, Li Yu, Shiquan Min, Shunfang Wang.
    [Paper] [Code]
  • ShapeMamba-EM: Fine-Tuning Foundation Model with Local Shape Descriptors and Mamba Blocks for 3D EM Image Segmentation. [26 August, 2024] [ArXiv, 2024]
    Ruohua Shi, Qiufan Pang, Lei Ma, Lingyu Duan, Tiejun Huang, Tingting Jiang.
    [Paper]
  • LoG-VMamba: Local-Global Vision Mamba for Medical Image Segmentation. [26 August, 2024] [ArXiv, 2024]
    Trung Dinh Quoc Dang, Huy Hoang Nguyen, Aleksei Tiulpin.
    [Paper] [Code]
  • ZeroMamba: Exploring Visual State Space Model for Zero-Shot Learning. [27 August, 2024] [ArXiv, 2024]
    Wenjin Hou, Dingjie Fu, Kun Li, Shiming Chen, Hehe Fan, Yi Yang.
    [Paper] [Code]
  • MTMamba++: Enhancing Multi-Task Dense Scene Understanding via Mamba-Based Decoders. [27 August, 2024] [ArXiv, 2024]
    Baijiong Lin, Weisen Jiang, Pengguang Chen, Shu Liu, Ying-Cong Chen.
    [Paper] [Code]

General Vision

1 High-level/Mid-level Vision

1.1 Vision Backbone with Mamba

  • Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model. [10 February, 2024] [ArXiv, 2024]
    Lianghui Zhu, Bencheng Liao, Qian Zhang, Xinlong Wang, Wenyu Liu, Xinggang Wang.
    [Paper] [Code]
  • VMamba: Visual State Space Model. [10 April, 2024] [ArXiv, 2024]
    Yue Liu, Yunjie Tian, Yuzhong Zhao, Hongtian Yu, Lingxi Xie, Yaowei Wang, Qixiang Ye, Yunfan Liu.
    [Paper] [Code]
  • Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data. [19 March, 2024] [ArXiv, 2024]
    Shufan Li, Harkanwar Singh, Aditya Grover.
    [Paper] [Code]
  • LocalMamba: Visual State Space Model with Windowed Selective Scan. [14 March, 2024] [ArXiv, 2024]
    Tao Huang, Xiaohuan Pei, Shan You, Fei Wang, Chen Qian, Chang Xu.
    [Paper] [Code]
  • EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba. [14 March, 2024] [ArXiv, 2024]
    Xiaohuan Pei, Tao Huang, Chang Xu.
    [Paper] [Code]
  • SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series. [24 April, 2024] [ArXiv, 2024]
    Badri N. Patro, Vijay S. Agneeswaran.
    [Paper] [Code]
  • PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition. [26 March, 2024] [ArXiv, 2024]
    Chenhongyi Yang, Zehui Chen, Miguel Espinosa, Linus Ericsson, Zhenyu Wang, Jiaming Liu, Elliot J. Crowley.
    [Paper] [Code]
  • On the low-shot transferability of [V]-Mamba. [15 March, 2024] [ArXiv, 2024]
    Diganta Misra, Jay Gala, Antonio Orvieto.
    [Paper]
  • DGMamba: Domain Generalization via Generalized State Space Model. [11 April, 2024] [ArXiv, 2024]
    Shaocong Long, Qianyu Zhou, Xiangtai Li, Xuequan Lu, Chenhao Ying, Yuan Luo, Lizhuang Ma, Shuicheng Yan.
    [Paper] [Code]
  • MambaOut: Do We Really Need Mamba for Vision? [14 May, 2024] [ArXiv, 2024]
    Weihao Yu, Xinchao Wang.
    [Paper] [Code]
  • Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model. [23 May, 2024] [ArXiv, 2024]
    Yuheng Shi, Minjing Dong, Chang Xu.
    [Paper] [Code]
  • Mamba-R: Vision Mamba ALSO Needs Registers. [23 May, 2024] [ArXiv, 2024]
    Feng Wang, Jiahao Wang, Sucheng Ren, Guoyizhe Wei, Jieru Mei, Wei Shao, Yuyin Zhou, Alan Yuille, Cihang Xie.
    [Paper] [Homepage] [Code]
  • Demystify Mamba in Vision: A Linear Attention Perspective. [26 May, 2024] [ArXiv, 2024]
    Dongchen Han, Ziyi Wang, Zhuofan Xia, Yizeng Han, Yifan Pu, Chunjiang Ge, Jun Song, Shiji Song, Bo Zheng, Gao Huang.
    [Paper] [Code]
  • Vim-F: Visual State Space Model Benefiting from Learning in the Frequency Domain. [28 May, 2024] [ArXiv, 2024]
    Juntao Zhang, Kun Bian, Peng Cheng, Wenbo An, Jianning Liu, Jun Zhou.
    [Paper] [Code]
  • Mamba YOLO: SSMs-Based YOLO For Object Detection. [9 June, 2024] [ArXiv, 2024]
    Zeyu Wang, Chen Li, Huiying Xu, Xinzhong Zhu.
    [Paper] [Code]
  • Autoregressive Pretraining with Mamba in Vision. [11 June, 2024] [ArXiv, 2024]
    Sucheng Ren, Xianhang Li, Haoqin Tu, Feng Wang, Fangxun Shu, Lei Zhang, Jieru Mei, Linjie Yang, Peng Wang, Heng Wang, Alan Yuille, Cihang Xie.
    [Paper] [Code]
  • Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model. [27 June, 2024] [ArXiv, 2024]
    Haobo Yuan, Xiangtai Li, Lu Qi, Tao Zhang, Ming-Hsuan Yang, Shuicheng Yan, Chen Change Loy.
    [Paper] [Code]
  • Scalable Visual State Space Model with Fractal Scanning. [26 May, 2024] [ArXiv, 2024]
    Lv Tang, HaoKe Xiao, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Bo Li.
    [Paper]
  • MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders. [14 July, 2024] [ArXiv, 2024]
    Baijiong Lin, Weisen Jiang, Pengguang Chen, Yu Zhang, Shu Liu, Ying-Cong Chen.
    [Paper] [Code]
  • Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning. [8 July, 2024] [ArXiv, 2024]
    Xiaojie Li, Yibo Yang, Jianlong Wu, Bernard Ghanem, Liqiang Nie, Min Zhang.
    [Paper] [Code]
  • MambaVision: A Hybrid Mamba-Transformer Vision Backbone. [10 July, 2024] [ArXiv, 2024]
    Ali Hatamizadeh, Jan Kautz.
    [Paper] [Code]
  • GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model. [18 July, 2024] [ArXiv, 2024]
    Abdelrahman Shaker, Syed Talal Wasim, Salman Khan, Juergen Gall, Fahad Shahbaz Khan.
    [Paper] [Code]

1.2 Video Analysis and Understanding

  • VideoMamba: State Space Model for Efficient Video Understanding. [March, 2024] [ArXiv, 2024]
    Kunchang Li, Xinhao Li, Yi Wang, Yinan He, Yali Wang, Limin Wang, Yu Qiao.
    [Paper] [Code]
  • Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding. [14 March, 2024] [ArXiv, 2024]
    Guo Chen, Yifei Huang, Jilan Xu, Baoqi Pei, Zhe Chen, Zhiqi Li, Jiahao Wang, Kunchang Li, Tong Lu, Limin Wang.
    [Paper] [Code]
  • RhythmMamba: Fast Remote Physiological Measurement with Arbitrary Length Videos. [9 April, 2024] [ArXiv, 2024]
    Bochao Zou, Zizheng Guo, Xiaocheng Hu, Huimin Ma.
    [Paper] [Code]
  • VideoMambaPro: A Leap Forward for Mamba in Video Understanding. [27 June, 2024] [ArXiv, 2024]
    Hui Lu, Albert Ali Salah, Ronald Poppe.
    [Paper] [Code]
  • DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark. [30 May, 2024] [ArXiv, 2024]
    Haoxing Chen, Yan Hong, Zizheng Huang, Zhuoer Xu, Zhangxuan Gu, Yaohui Li, Jun Lan, Huijia Zhu, Jianfu Zhang, Weiqiang Wang, Huaxiong Li.
    [Paper] [Code]
  • QueryMamba: A Mamba-Based Encoder-Decoder Architecture with a Statistical Verb-Noun Interaction Module for Video Action Forecasting. [4 July, 2024] [ArXiv, 2024]
    Zeyun Zhong, Manuel Martin, Frederik Diederichs, Juergen Beyerer.
    [Paper]
  • VideoMamba: Spatio-Temporal Selective State Space Model. [11 July, 2024] [ArXiv, 2024]
    Jinyoung Park, Hee-Seon Kim, Kangwook Ko, Minbeom Kim, Changick Kim.
    [Paper] [Code]
  • Harnessing Temporal Causality for Advanced Temporal Action Detection. [25 July, 2024] [ArXiv, 2024]
    Shuming Liu, Lin Sui, Chen-Lin Zhang, Fangzhou Mu, Chen Zhao, Bernard Ghanem.
    [Paper] [Code]

1.3 Down-stream Visual Applications

  • Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning. [28 April, 2024] [ArXiv, 2024]
    Chi-Sheng Chen, Guan-Ying Chen, Dong Zhou, Di Jiang, Dai-Shi Chen.
    [Paper] [Code]
  • InsectMamba: Insect Pest Classification with State Space Model. [4 April, 2024] [ArXiv, 2024]
    Qianning Wang, Chenglin Wang, Zhixin Lai, Yucheng Zhou.
    [Paper]
  • MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection. [17 March, 2024] [ArXiv, 2024]
    Tianxiang Chen, Zhentao Tan, Tao Gong, Qi Chu, Yue Wu, Bin Liu, Jieping Ye, Nenghai Yu.
    [Paper] [Code]
  • MemoryMamba: Memory-Augmented State Space Model for Defect Recognition. [6 May, 2024] [ArXiv, 2024]
    Qianning Wang, He Hu, Yucheng Zhou.
    [Paper]
  • FER-YOLO-Mamba: Facial Expression Detection and Classification Based on Selective State Space. [9 May, 2024] [ArXiv, 2024]
    Hui Ma, Sen Lei, Turgay Celik, Heng-Chao Li.
    [Paper] [Code]
  • OverlapMamba: Novel Shift State Space Model for LiDAR-based Place Recognition. [13 May, 2024] [ArXiv, 2024]
    Qiuchi Xiang, Jintao Cheng, Jiehao Luo, Jin Wu, Rui Fan, Xieyuanli Chen, Xiaoyu Tang.
    [Paper]
  • TokenUnify: Scalable Autoregressive Visual Pre-training with Mixture Token Prediction. [27 May, 2024] [ArXiv, 2024]
    Yinda Chen, Haoyuan Shi, Xiaoyu Liu, Te Shi, Ruobing Zhang, Dong Liu, Zhiwei Xiong, Feng Wu.
    [Paper] [Code]
  • MambaDepth: Enhancing Long-range Dependency for Self-Supervised Fine-Structured Monocular Depth Estimation. [6 June, 2024] [ArXiv, 2024]
    Ionuţ Grigore, Călin-Adrian Popa.
    [Paper] [Code]
  • Q-Mamba: On First Exploration of Vision Mamba for Image Quality Assessment. [13 June, 2024] [ArXiv, 2024]
    Fengbin Guan, Xin Li, Zihao Yu, Yiting Lu, Zhibo Chen.
    [Paper]
  • SUM: Saliency Unification through Mamba for Visual Attention Modeling. [25 June, 2024] [ArXiv, 2024]
    Alireza Hosseini, Amirhossein Kazerouni, Saeed Akhavan, Michael Brudno, Babak Taati.
    [Paper] [Code]
  • VMambaCC: A Visual State Space Model for Crowd Counting. [6 May, 2024] [ArXiv, 2024]
    Hao-Yuan Ma, Li Zhang, Shuai Shi.
    [Paper]
  • MonoMM: A Multi-scale Mamba-Enhanced Network for Real-time Monocular 3D Object Detection. [1 August, 2024] [ArXiv, 2024]
    Youjia Fu, Zihao Xu, Junsong Fu, Huixia Xue, Shuqiu Tan, Lei Li.
    [Paper]

2 Low-level Vision

2.1 Image Denoising and Enhancement

  • U-shaped Vision Mamba for Single Image Dehazing. [15 February, 2024] [ArXiv, 2024]
    Zhuoran Zheng, Chen Wu.
    [Paper] [Code]
  • FreqMamba: Viewing Mamba from a Frequency Perspective for Image Deraining. [15 April, 2024] [ArXiv, 2024]
    Zou Zhen, Yu Hu, Zhao Feng.
    [Paper]
  • FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining. [29 May, 2024] [ArXiv, 2024]
    Dong Li, Yidi Liu, Xueyang Fu, Senyan Xu, Zheng-Jun Zha.
    [Paper]
  • LLEMamba: Low-Light Enhancement via Relighting-Guided Mamba with Deep Unfolding Network. [3 June, 2024] [ArXiv, 2024]
    Xuanqi Zhang, Haijin Zeng, Jinwang Pan, Qiangqiang Shen, Yongyong Chen.
    [Paper]
  • PixMamba: Leveraging State Space Models in a Dual-Level Architecture for Underwater Image Enhancement. [12 June, 2024] [ArXiv, 2024]
    Wei-Tung Lin, Yong-Xiang Lin, Jyun-Wei Chen, Kai-Lung Hua.
    [Paper] [Code]

2.2 Image Restoration

  • MambaIR: A Simple Baseline for Image Restoration with State-Space Model. [25 March, 2024] [ArXiv, 2024]
    Hang Guo, Jinmin Li, Tao Dai, Zhihao Ouyang, Xudong Ren, Shu-Tao Xia.
    [Paper] [Code]
  • Activating Wider Areas in Image Super-Resolution. [13 March, 2024] [ArXiv, 2024]
    Cheng Cheng, Hang Wang, Hongbin Sun.
    [Paper]
  • CU-Mamba: Selective State Space Models with Channel Learning for Image Restoration. [17 April, 2024] [ArXiv, 2024]
    Rui Deng, Tianpei Gu.
    [Paper]
  • VmambaIR: Visual State Space Model for Image Restoration. [17 March, 2024] [ArXiv, 2024]
    Yuan Shi, Bin Xia, Xiaoyu Jin, Xing Wang, Tianyu Zhao, Xin Xia, Xuefeng Xiao, Wenming Yang.
    [Paper] [Code]
  • Retinexmamba: Retinex-based Mamba for Low-light Image Enhancement. [6 May, 2024] [ArXiv, 2024]
    Jiesong Bai, Yuhao Yin, Qiyuan He.
    [Paper] [Code]
  • DVMSR: Distillated Vision Mamba for Efficient Super-Resolution. [11 May, 2024] [ArXiv, 2024]
    Xiaoyan Lei, Wenlong Zhang, Weifeng Cao.
    [Paper] [Code]
  • IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model. [16 May, 2024] [ArXiv, 2024]
    Yongsong Huang, Tomo Miyazaki, Xiaofeng Liu, Shinichiro Omachi.
    [Paper] [Code]
  • LFMamba: Light Field Image Super-Resolution with State Space Model. [18 June, 2024] [ArXiv, 2024]
    Wang Xia, Yao Lu, Shunzhou Wang, Ziqi Wang, Peiqi Xia, Tianfei Zhou.
    [Paper]
  • Mamba-based Light Field Super-Resolution with Efficient Subspace Scanning. [23 June, 2024] [ArXiv, 2024]
    Ruisheng Gao, Zeyu Xiao, Zhiwei Xiong.
    [Paper]
  • MxT: Mamba x Transformer for Image Inpainting. [26 July, 2024] [ArXiv, 2024]
    Shuang Chen, Amir Atapour-Abarghouei, Haozheng Zhang, Hubert P. H. Shum.
    [Paper]
  • GMSR: Gradient-Guided Mamba for Spectral Reconstruction from RGB Images. [13 May, 2024] [ArXiv, 2024]
    Xinying Wang, Zhixiong Huang, Sifan Zhang, Jiawen Zhu, Lin Feng.
    [Paper] [Code]
  • Dual Hyperspectral Mamba for Efficient Spectral Compressive Imaging. [1 June, 2024] [ArXiv, 2024]
    Jiahua Dong, Hui Yin, Hongliu Li, Wenbo Li, Yulun Zhang, Salman Khan, Fahad Shahbaz Khan.
    [Paper] [Code]
  • HTD-Mamba: Efficient Hyperspectral Target Detection with Pyramid State Space Model. [17 July, 2024] [ArXiv, 2024]
    Dunbin Shen, Xuanbing Zhu, Jiacheng Tian, Jianjun Liu, Zhenrong Du, Hongyu Wang, Xiaorui Ma.
    [Paper] [Code]
  • Empowering Snapshot Compressive Imaging: Spatial-Spectral State Space Model with Across-Scanning and Local Enhancement. [1 August, 2024] [ArXiv, 2024]
    Wenzhe Tian, Haijin Zeng, Yin-Ping Zhao, Yongyong Chen, Zhen Wang, Xuelong Li.
    [Paper]

3 3-D Visual Recognition

3.1 Point Cloud Analysis

  • PointMamba: A Simple State Space Model for Point Cloud Analysis. [2 April, 2024] [ArXiv, 2024]
    Dingkang Liang, Xin Zhou, Xinyu Wang, Xingkui Zhu, Wei Xu, Zhikang Zou, Xiaoqing Ye, Xiang Bai.
    [Paper] [Code]
  • Point Cloud Mamba: Point Cloud Learning via State Space Model. [29 March, 2024] [ArXiv, 2024]
    Tao Zhang, Xiangtai Li, Haobo Yuan, Shunping Ji, Shuicheng Yan.
    [Paper] [Code]
  • Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy. [17 March, 2024] [ArXiv, 2024]
    Jiuming Liu, Ruiji Yu, Yian Wang, Yu Zheng, Tianchen Deng, Weicai Ye, Hesheng Wang.
    [Paper] [Code]
  • 3DMambaComplete: Exploring Structured State Space Model for Point Cloud Completion. [10 April, 2024] [ArXiv, 2024]
    Yixuan Li, Weidong Yang, Ben Fei.
    [Paper]
  • Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba. [9 May, 2024] [ArXiv, 2024]
    Hongwei Ren, Yue Zhou, Jiadong Zhu, Haotian Fu, Yulong Huang, Xiaopeng Lin, Yuetong Fang, Fei Ma, Hao Yu, Bojun Cheng.
    [Paper]
  • MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models. [23 May, 2024] [ArXiv, 2024]
    Jiuming Liu, Jinru Han, Lihao Liu, Angelica I. Aviles-Rivero, Chaokang Jiang, Zhe Liu, Hesheng Wang.
    [Paper]
  • PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis. [24 May, 2024] [ArXiv, 2024]
    Zicheng Wang, Zhenghao Chen, Yiming Wu, Zhen Zhao, Luping Zhou, Dong Xu.
    [Paper] [Code]
  • LCM: Locally Constrained Compact Point Cloud Model for Masked Point Modeling. [27 May, 2024] [ArXiv, 2024]
    Yaohua Zha, Naiqi Li, Yanzi Wang, Tao Dai, Hang Guo, Bin Chen, Zhi Wang, Zhihao Ouyang, Shu-Tao Xia.
    [Paper]
  • Efficient 3D Shape Generation via Diffusion Mamba with Bidirectional SSMs. [7 June, 2024] [ArXiv, 2024]
    Shentong Mo.
    [Paper]
  • PointABM: Integrating Bidirectional State Space Model with Multi-Head Self-Attention for Point Cloud Analysis. [10 June, 2024] [ArXiv, 2024]
    Jia-wei Chen, Yu-jie Xiong, Yong-bin Gao.
    [Paper]
  • Mamba24/8D: Enhancing Global Interaction in Point Clouds via State Space Model. [25 June, 2024] [ArXiv, 2024]
    Zhuoyuan Li, Yubo Ai, Jiahao Lu, ChuXin Wang, Jiacheng Deng, Hanzhi Chang, Yanzhe Liang, Wenfei Yang, Shifeng Zhang, Tianzhu Zhang.
    [Paper]
  • Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object Detection. [18 June, 2024] [ArXiv, 2024]
    Guowen Zhang, Lue Fan, Chenhang He, Zhen Lei, Zhaoxiang Zhang, Lei Zhang.
    [Paper] [Code]
  • Serialized Point Mamba: A Serialized Point Cloud Mamba Segmentation Model. [17 July, 2024] [ArXiv, 2024]
    Tao Wang, Wei Wen, Jingzhi Zhai, Kang Xu, Haoming Luo.
    [Paper]

3.2 Hyperspectral Imaging Analysis

  • Mamba-FETrack: Frame-Event Tracking via State Space Model. [28 April, 2024] [ArXiv, 2024]
    Ju Huang, Shiao Wang, Shuai Wang, Zhe Wu, Xiao Wang, Bo Jiang.
    [Paper] [Code]
  • 3DSS-Mamba: 3D-Spectral-Spatial Mamba for Hyperspectral Image Classification. [21 May, 2024] [ArXiv, 2024]
    Yan He, Bing Tu, Bo Liu, Jun Li, Antonio Plaza.
    [Paper]
  • DualMamba: A Lightweight Spectral-Spatial Mamba-Convolution Network for Hyperspectral Image Classification. [11 June, 2024] [ArXiv, 2024]
    Jiamu Sheng, Jingyi Zhou, Jiong Wang, Peng Ye, Jiayuan Fan.
    [Paper]

4 Visual Data Generation

  • ZigMa: A DiT-style Zigzag Mamba Diffusion Model. [1 April, 2024] [ArXiv, 2024]
    Vincent Tao Hu, Stefan Andreas Baumann, Ming Gui, Olga Grebenkova, Pingchuan Ma, Johannes Fischer, Björn Ommer.
    [Paper] [Homepage] [Code]
  • Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM. [19 March, 2024] [ArXiv, 2024]
    Zeyu Zhang, Akide Liu, Ian Reid, Richard Hartley, Bohan Zhuang, Hao Tang.
    [Paper] [Homepage] [Code]
  • Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction. [29 March, 2024] [ArXiv, 2024]
    Qiuhong Shen, Xuanyu Yi, Zike Wu, Pan Zhou, Hanwang Zhang, Shuicheng Yan, Xinchao Wang.
    [Paper]
  • Matten: Video Generation with Mamba-Attention. [5 May, 2024] [ArXiv, 2024]
    Yu Gao, Jiancheng Huang, Xiaopeng Sun, Zequn Jie, Yujie Zhong, Lin Ma.
    [Paper]
  • SMCD: High Realism Motion Style Transfer via Mamba-based Diffusion. [5 May, 2024] [ArXiv, 2024]
    Ziyun Qian, Zeyu Xiao, Zhenyi Wu, Dingkang Yang, Mingcheng Li, Shunli Wang, Shuaibing Wang, Dongliang Kou, Lihua Zhang.
    [Paper]
  • DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis. [23 May, 2024] [ArXiv, 2024]
    Yao Teng, Yue Wu, Han Shi, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui Liu.
    [Paper] [Code]
  • Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation. [24 May, 2024] [ArXiv, 2024]
    Shentong Mo, Yapeng Tian.
    [Paper]
  • Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent. [27 May, 2024] [ArXiv, 2024]
    Yi Xu, Yun Fu.
    [Paper] [Code]
  • Dimba: Transformer-Mamba Diffusion Models. [3 June, 2024] [ArXiv, 2024]
    Zhengcong Fei, Mingyuan Fan, Changqian Yu, Debang Li, Youqiang Zhang, Junshi Huang.
    [Paper] [Homepage] [Code]
  • Hamba: Single-view 3D Hand Reconstruction with Graph-guided Bi-Scanning Mamba. [12 July, 2024] [ArXiv, 2024]
    Haoye Dong, Aviral Chharia, Wenbo Gou, Francisco Vicente Carrasco, Fernando De la Torre.
    [Paper] [Homepage] [Code]
  • InfiniMotion: Mamba Boosts Memory in Transformer for Arbitrary Long Motion Generation. [13 July, 2024] [ArXiv, 2024]
    Zeyu Zhang, Akide Liu, Qi Chen, Feng Chen, Ian Reid, Richard Hartley, Bohan Zhuang, Hao Tang.
    [Paper] [Homepage]
  • OPa-Ma: Text Guided Mamba for 360-degree Image Out-painting. [15 July, 2024] [ArXiv, 2024]
    Penglei Gao, Kai Yao, Tiandi Ye, Steven Wang, Yuan Yao, Xiaofeng Wang.
    [Paper]

Multi-Modal

1 Heterologous Stream

1.1 Multi-Modal Understanding

  • MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models. [14 March, 2024] [ArXiv, 2024]
    Zunnan Xu, Yukang Lin, Haonan Han, Sicheng Yang, Ronghui Li, Yachao Zhang, Xiu Li.
    [Paper]
  • ReMamber: Referring Image Segmentation with Mamba Twister. [26 March, 2024] [ArXiv, 2024]
    Yuhuan Yang, Chaofan Ma, Jiangchao Yao, Zhun Zhong, Ya Zhang, Yanfeng Wang.
    [Paper]
  • SpikeMba: Multi-Modal Spiking Saliency Mamba for Temporal Video Grounding. [1 April, 2024] [ArXiv, 2024]
    Wenrui Li, Xiaopeng Hong, Xiaopeng Fan.
    [Paper]
  • An Empirical Study of Mamba-based Pedestrian Attribute Recognition. [14 July, 2024] [ArXiv, 2024]
    Xiao Wang, Weizhe Kong, Jiandong Jin, Shiao Wang, Ruichong Gao, Qingchuan Ma, Chenglong Li, Jin Tang.
    [Paper] [Code]

1.2 Multi-Modal Large Language Models

  • VL-Mamba: Exploring State Space Models for Multimodal Learning. [20 March, 2024] [ArXiv, 2024]
    Yanyuan Qiao, Zheng Yu, Longteng Guo, Sihan Chen, Zijia Zhao, Mingzhen Sun, Qi Wu, Jing Liu.
    [Paper] [Homepage] [Code]
  • Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference. [22 March, 2024] [ArXiv, 2024]
    Han Zhao, Min Zhang, Wei Zhao, Pengxiang Ding, Siteng Huang, Donglin Wang.
    [Paper] [Homepage] [Code]
  • CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation. [30 April, 2024] [ArXiv, 2024]
    Weiquan Huang, Yifei Shen, Yifan Yang.
    [Paper] [Code]
  • Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models. [27 May, 2024] [ArXiv, 2024]
    Byung-Kwan Lee, Chae Won Kim, Beomchan Park, Yong Man Ro.
    [Paper] [Code]
  • RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation. [6 June, 2024] [ArXiv, 2024]
    Jiaming Liu, Mengzhen Liu, Zhenyu Wang, Lily Lee, Kaichen Zhou, Pengju An, Senqiao Yang, Renrui Zhang, Yandong Guo, Shanghang Zhang.
    [Paper] [Homepage] [Code]

2 Homologous Stream

  • Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation. [5 April, 2024] [ArXiv, 2024]
    Zifu Wan, Yuhao Wang, Silong Yong, Pingping Zhang, Simon Stepputtis, Katia Sycara, Yaqi Xie.
    [Paper] [Code]
  • Fusion-Mamba for Cross-modality Object Detection. [14 April, 2024] [ArXiv, 2024]
    Wenhao Dong, Haodong Zhu, Shaohui Lin, Xiaoyan Luo, Yunhang Shen, Xuhui Liu, Juan Zhang, Guodong Guo, Baochang Zhang.
    [Paper]

Vertical Application

1 Remote Sensing Image

1.1 Remote Sensing Image Processing

  • Pan-Mamba: Effective pan-sharpening with State Space Model. [8 March, 2024] [ArXiv, 2024]
    Xuanhua He, Ke Cao, Keyu Yan, Rui Li, Chengjun Xie, Jie Zhang, Man Zhou.
    [Paper] [Code]
  • HSIDMamba: Exploring Bidirectional State-Space Models for Hyperspectral Denoising. [15 April, 2024] [ArXiv, 2024]
    Yang Liu, Jiahua Xiao, Yu Guo, Peilin Jiang, Haiwei Yang, Fei Wang.
    [Paper]
  • SSUMamba: Spatial-Spectral Selective State Space Model for Hyperspectral Image Denoising. [15 May, 2024] [ArXiv, 2024]
    Guanyiman Fu, Fengchao Xiong, Jianfeng Lu, Jun Zhou, Yuntao Qian.
    [Paper] [Code]
  • Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution. [8 May, 2024] [ArXiv, 2024]
    Yi Xiao, Qiangqiang Yuan, Kui Jiang, Yuzeng Chen, Qiang Zhang, Chia-Wen Lin.
    [Paper]
  • RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing. [16 May, 2024] [ArXiv, 2024]
    Huiling Zhou, Xianhao Wu, Hongming Chen, Xiang Chen, Xin He.
    [Paper]
  • HDMba: Hyperspectral Remote Sensing Imagery Dehazing with State Space Model. [9 June, 2024] [ArXiv, 2024]
    Hang Fu, Genyun Sun, Yinhe Li, Jinchang Ren, Aizhu Zhang, Cheng Jing, Pedram Ghamisi.
    [Paper] [Code]

1.2 Remote Sensing Image Classification

  • RSMamba: Remote Sensing Image Classification with State Space Model. [28 March, 2024] [ArXiv, 2024]
    Keyan Chen, Bowen Chen, Chenyang Liu, Wenyuan Li, Zhengxia Zou, Zhenwei Shi.
    [Paper]
  • SpectralMamba: Efficient Mamba for Hyperspectral Image Classification. [12 April, 2024] [ArXiv, 2024]
    Jing Yao, Danfeng Hong, Chenyu Li, Jocelyn Chanussot.
    [Paper] [Code]
  • Spectral-Spatial Mamba for Hyperspectral Image Classification. [29 April, 2024] [ArXiv, 2024]
    Lingbo Huang, Yushi Chen, Xin He.
    [Paper] [Code]
  • S2Mamba: A Spatial-spectral State Space Model for Hyperspectral Image Classification. [28 April, 2024] [ArXiv, 2024]
    Guanchun Wang, Xiangrong Zhang, Zelin Peng, Tianyang Zhang, Xiuping Jia, Licheng Jiao.
    [Paper] [Code]
  • Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification. [25 May, 2024] [ArXiv, 2024]
    Weilian Zhou, Sei-Ichiro Kamata, Haipeng Wang, Man-Sing Wong, Huiying Hou.
    [Paper] [Code]
  • SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients. [5 May, 2024] [ArXiv, 2024]
    Tushar Verma, Jyotsna Singh, Yash Bhartari, Rishi Jarwal, Suraj Singh, Shubhkarman Singh.
    [Paper] [Code]
  • GraphMamba: An Efficient Graph Structure Learning Vision Mamba for Hyperspectral Image Classification. [11 July, 2024] [ArXiv, 2024]
    Aitao Yang, Min Li, Yao Ding, Leyuan Fang, Yaoming Cai, Yujie He.
    [Paper] [Code]

1.3 Remote Sensing Image Change Detection

  • ChangeMamba: Remote Sensing Change Detection with Spatio-Temporal State Space Model. [14 April, 2024] [ArXiv, 2024]
    Hongruixuan Chen, Jian Song, Chengxi Han, Junshi Xia, Naoto Yokoya.
    [Paper] [Code]
  • RSCaMa: Remote Sensing Image Change Captioning with State Space Model. [2 May, 2024] [ArXiv, 2024]
    Chenyang Liu, Keyan Chen, Bowen Chen, Haotian Zhang, Zhengxia Zou, Zhenwei Shi.
    [Paper] [Code]
  • CDMamba: Remote Sensing Image Change Detection with Mamba. [6 June, 2024] [ArXiv, 2024]
    Haotian Zhang, Keyan Chen, Chenyang Liu, Hao Chen, Zhengxia Zou, Zhenwei Shi.
    [Paper] [Code]
  • A Mamba-based Siamese Network for Remote Sensing Change Detection. [8 July, 2024] [ArXiv, 2024]
    Jay N. Paranjape, Celso de Melo, Vishal M. Patel.
    [Paper] [Code]

1.4 Remote Sensing Image Segmentation

  • Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model. [11 April, 2024] [ArXiv, 2024]
    Qinfeng Zhu, Yuanzhi Cai, Yuan Fang, Yihan Yang, Cheng Chen, Lei Fan, Anh Nguyen.
    [Paper] [Code]
  • RS3Mamba: Visual State Space Model for Remote Sensing Images Semantic Segmentation. [3 April, 2024] [ArXiv, 2024]
    Xianping Ma, Xiaokang Zhang, Man-On Pun.
    [Paper] [Code]
  • RS-Mamba for Large Remote Sensing Image Dense Prediction. [10 April, 2024] [ArXiv, 2024]
    Sijie Zhao, Hao Chen, Xueliang Zhang, Pengfeng Xiao, Lei Bai, Wanli Ouyang.
    [Paper] [Code]
  • Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study. [14 May, 2024] [ArXiv, 2024]
    Qinfeng Zhu, Yuan Fang, Yuanzhi Cai, Cheng Chen, Lei Fan.
    [Paper]
  • CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation. [17 May, 2024] [ArXiv, 2024]
    Mushui Liu, Jun Dan, Ziqian Lu, Yunlong Yu, Yingming Li, Xi Li.
    [Paper] [Code]
  • PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery. [16 June, 2024] [ArXiv, 2024]
    Libo Wang, Dongxu Li, Sijun Dong, Xiaoliang Meng, Xiaokang Zhang, Danfeng Hong.
    [Paper] [Code]

1.5 Remote Sensing Image Fusion

  • FusionMamba: Efficient Image Fusion with State Space Model. [11 April, 2024] [ArXiv, 2024]
    Siran Peng, Xiangyu Zhu, Haoyu Deng, Zhen Lei, Liang-Jian Deng.
    [Paper]
  • A Novel State Space Model with Local Enhancement and State Sharing for Image Fusion. [14 April, 2024] [ArXiv, 2024]
    Zihan Cao, Xiao Wu, Liang-Jian Deng, Yu Zhong.
    [Paper]
  • DMM: Disparity-guided Multispectral Mamba for Oriented Object Detection in Remote Sensing. [10 July, 2024] [ArXiv, 2024]
    Minghang Zhou, Tianyu Li, Chaofan Qiao, Dongyu Xie, Guoqing Wang, Ningjuan Ruan, Lin Mei, Yang Yang.
    [Paper] [Code]

2 Medical Image

2.1 Medical Image Segmentation

2.1.1 Preliminary explorations of U-shaped Mamba
  • U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation. [9 January, 2024] [ArXiv, 2024]
    Jun Ma, Feifei Li, Bo Wang.
    [Paper] [Homepage] [Code]
  • VM-UNet: Vision Mamba UNet for Medical Image Segmentation. [4 February, 2024] [ArXiv, 2024]
    Jiacheng Ruan, Suncheng Xiang.
    [Paper] [Code]
  • Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation. [30 March, 2024] [ArXiv, 2024]
    Ziyang Wang, Jian-Qing Zheng, Yichi Zhang, Ge Cui, Lei Li.
    [Paper] [Code]
  • Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining. [6 March, 2024] [ArXiv, 2024]
    Jiarun Liu, Hao Yang, Hong-Yu Zhou, Yan Xi, Lequan Yu, Yizhou Yu, Yong Liang, Guangming Shi, Shaoting Zhang, Hairong Zheng, Shanshan Wang.
    [Paper] [Code]
2.1.2 Improvements to the U-shaped Mamba
  • LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation. [11 March, 2024] [ArXiv, 2024]
    Weibin Liao, Yinghao Zhu, Xinyuan Wang, Chengwei Pan, Yasha Wang, Liantao Ma.
    [Paper] [Code]
  • VM-UNET-V2 Rethinking Vision Mamba UNet for Medical Image Segmentation. [14 March, 2024] [ArXiv, 2024]
    Mingya Zhang, Yue Yu, Limei Gu, Tingsheng Lin, Xianping Tao.
    [Paper] [Code]
  • Large Window-based Mamba UNet for Medical Image Segmentation: Beyond Convolution and Self-attention. [12 March, 2024] [ArXiv, 2024]
    Jinhong Wang, Jintai Chen, Danny Chen, Jian Wu.
    [Paper] [Code]
  • H-vmunet: High-order Vision Mamba UNet for Medical Image Segmentation. [20 March, 2024] [ArXiv, 2024]
    Renkai Wu, Yinghao Liu, Pengchen Liang, Qing Chang.
    [Paper] [Code]
  • Integrating Mamba Sequence Model and Hierarchical Upsampling Network for Accurate Semantic Segmentation of Multiple Sclerosis Legion. [26 March, 2024] [ArXiv, 2024]
    Kazi Shahriar Sanjid, Md. Tanzim Hossain, Md. Shakib Shahariar Junayed, Dr. Mohammad Monir Uddin.
    [Paper]
  • Rotate to Scan: UNet-like Mamba with Triplet SSM Module for Medical Image Segmentation. [16 April, 2024] [ArXiv, 2024]
    Hao Tang, Lianglun Cheng, Guoheng Huang, Zhengguang Tan, Junhao Lu, Kaihong Wu.
    [Paper]
  • UltraLight VM-UNet: Parallel Vision Mamba Significantly Reduces Parameters for Skin Lesion Segmentation. [24 April, 2024] [ArXiv, 2024]
    Renkai Wu, Yinghao Liu, Pengchen Liang, Qing Chang.
    [Paper] [Code]
  • AC-MAMBASEG: An adaptive convolution and Mamba-based architecture for enhanced skin lesion segmentation. [5 May, 2024] [ArXiv, 2024]
    Xiaoyan Lei, Wenlong Zhang, Weifeng Cao.
    [Paper] [Code]
  • MUCM-Net: A Mamba Powered UCM-Net for Skin Lesion Segmentation. [24 May, 2024] [ArXiv, 2024]
    Chunyu Yuan, Dongfang Zhao, Sos S. Agaian.
    [Paper] [Code]
  • Convolution and Attention-Free Mamba-based Cardiac Image Segmentation. [9 June, 2024] [ArXiv, 2024]
    Abbas Khan, Muhammad Asad, Martin Benning, Caroline Roney, Gregory Slabaugh.
    [Paper]
  • MHS-VM: Multi-Head Scanning in Parallel Subspaces for Vision Mamba. [9 June, 2024] [ArXiv, 2024]
    Zhongping Ji.
    [Paper] [Code]
  • HC-Mamba: Vision MAMBA with Hybrid Convolutional Techniques for Medical Image Segmentation. [11 May, 2024] [ArXiv, 2024]
    Jiashu Xu.
    [Paper]
  • SliceMamba for Medical Image Segmentation. [11 July, 2024] [ArXiv, 2024]
    Chao Fan, Hongyuan Yu, Luo Wang, Yan Huang, Liang Wang, Xibin Jia.
    [Paper]
2.1.3 U-shaped Mamba with other methodologies
  • Semi-Mamba-UNet: Pixel-Level Contrastive and Pixel-Level Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation. [29 March, 2024] [ArXiv, 2024]
    Chao Ma, Ziyang Wang.
    [Paper] [Code]
  • Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation. [16 February, 2024] [ArXiv, 2024]
    Ziyang Wang, Chao Ma.
    [Paper] [Code]
  • ProMamba: Prompt-Mamba for polyp segmentation. [26 March, 2024] [ArXiv, 2024]
    Jianhao Xie, Ruofan Liao, Ziang Zhang, Sida Yi, Yuesheng Zhu, Guibo Luo.
    [Paper]
  • P-Mamba: Marrying Perona Malik Diffusion with Mamba for Efficient Pediatric Echocardiographic Left Ventricular Segmentation. [15 March, 2024] [ArXiv, 2024]
    Zi Ye, Tianxiang Chen, Fangyijie Wang, Hanwei Zhang, Guanxi Li, Lijun Zhang.
    [Paper]
2.1.4 Multi-Dimensional Medical Data Segmentation
  • SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation. [25 February, 2024] [ArXiv, 2024]
    Zhaohu Xing, Tian Ye, Yijun Yang, Guang Liu, Lei Zhu.
    [Paper] [Code]
  • nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark Detection with State Space Model. [10 March, 2024] [ArXiv, 2024]
    Haifan Gong, Luoyao Kang, Yitao Wang, Xiang Wan, Haofeng Li.
    [Paper] [Code]
  • T-Mamba: Frequency-Enhanced Gated Long-Range Dependency for Tooth 3D CBCT Segmentation. [1 April, 2024] [ArXiv, 2024]
    Jing Hao, Lei He, Kuo Feng Hung.
    [Paper] [Code]
  • Vivim: a Video Vision Mamba for Medical Video Object Segmentation. [12 March, 2024] [ArXiv, 2024]
    Yijun Yang, Zhaohu Xing, Chunwang Huang, Lei Zhu.
    [Paper] [Code]

2.2 Pathological Diagnosis

  • MedMamba: Vision Mamba for Medical Image Classification. [2 April, 2024] [ArXiv, 2024]
    Yubiao Yue, Zhenzhang Li.
    [Paper] [Code]
  • MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models. [8 March, 2024] [ArXiv, 2024]
    Zijie Fang, Yifeng Wang, Zhi Wang, Jian Zhang, Xiangyang Ji, Yongbing Zhang.
    [Paper]
  • MambaMIL: Enhancing Long Sequence Modeling with Sequence Reordering in Computational Pathology. [11 March, 2024] [ArXiv, 2024]
    Shu Yang, Yihui Wang, Hao Chen.
    [Paper] [Code]
  • CMViM: Contrastive Masked Vim Autoencoder for 3D Multi-modal Representation Learning for AD classification. [25 March, 2024] [ArXiv, 2024]
    Guangqian Yang, Kangrui Du, Zhihan Yang, Ye Du, Yongping Zheng, Shujun Wang.
    [Paper]
  • SurvMamba: State Space Model with Multi-grained Multi-modal Interaction for Survival Prediction. [11 April, 2024] [ArXiv, 2024]
    Ying Chen, Jiajing Xie, Yuxiang Lin, Yuhang Song, Wenxian Yang, Rongshan Yu.
    [Paper]
  • Cardiovascular Disease Detection from Multi-View Chest X-rays with BI-Mamba. [28 May, 2024] [ArXiv, 2024]
    Zefan Yang, Jiajin Zhang, Ge Wang, Mannudeep K. Kalra, Pingkun Yan.
    [Paper]
  • MGI: Multimodal Contrastive pre-training of Genomic and Medical Imaging. [2 June, 2024] [ArXiv, 2024]
    Jiaying Zhou, Mingzhou Jiang, Junde Wu, Jiayuan Zhu, Ziyue Wang, Yueming Jin.
    [Paper]
  • GFE-Mamba: Mamba-based AD Multi-modal Progression Assessment via Generative Feature Extraction from MCI. [22 July, 2024] [ArXiv, 2024]
    Zhaojie Fang, Shenghao Zhu, Yifei Chen, Binfeng Zou, Fan Jia, Linwei Qiu, Chang Liu, Yiyu Huang, Xiang Feng, Feiwei Qin, Changmiao Wang, Yeru Wang, Jin Fan, Changbiao Chu, Wan-Zhen Wu, Hu Zhao.
    [Paper] [Code]

2.3 Deformable Image Registration

  • MambaMorph: a Mamba-based Framework for Medical MR-CT Deformable Registration. [12 March, 2024] [ArXiv, 2024]
    Tao Guo, Yinuo Wang, Shihao Shu, Diansheng Chen, Zhouping Tang, Cai Meng, Xiangzhi Bai.
    [Paper] [Code]
  • VMambaMorph: a Visual Mamba-based Framework with Cross-Scan Module for Deformable 3D Image Registration. [7 April, 2024] [ArXiv, 2024]
    Ziyang Wang, Jian-Qing Zheng, Chao Ma, Tao Guo.
    [Paper] [Code]

2.4 Medical Image Reconstruction

  • FD-Vision Mamba for Endoscopic Exposure Correction. [14 February, 2024] [ArXiv, 2024]
    Zhuoran Zheng, Jun Zhang.
    [Paper] [Code]
  • MambaMIR: An Arbitrary-Masked Mamba for Joint Medical Image Reconstruction and Uncertainty Estimation. [19 March, 2024] [ArXiv, 2024]
    Jiahao Huang, Liutao Yang, Fanwen Wang, Yinzhe Wu, Yang Nan, Angelica I. Aviles-Rivero, Carola-Bibiane Schönlieb, Daoqiang Zhang, Guang Yang.
    [Paper] [Code]
  • FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba. [20 April, 2024] [ArXiv, 2024]
    Xinyu Xie, Yawen Cui, Chio-In Ieong, Tao Tan, Xiaozhi Zhang, Xubin Zheng, Zitong Yu.
    [Paper] [Code]
  • MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion. [12 April, 2024] [ArXiv, 2024]
    Zhe Li, Haiwei Pan, Kejia Zhang, Yuhua Wang, Fengming Yu.
    [Paper]
  • Enhancing Global Sensitivity and Uncertainty Quantification in Medical Image Reconstruction with Monte Carlo Arbitrary-Masked Mamba. [27 May, 2024] [ArXiv, 2024]
    Jiahao Huang, Liutao Yang, Fanwen Wang, Yinzhe Wu, Yang Nan, Weiwen Wu, Chengyan Wang, Kuangyu Shi, Angelica I. Aviles-Rivero, Carola-Bibiane Schönlieb, Daoqiang Zhang, Guang Yang.
    [Paper]
  • MMR-Mamba: Multi-Modal MRI Reconstruction with Mamba and Spatial-Frequency Information Fusion. [27 June, 2024] [ArXiv, 2024]
    Jing Zou, Lanqing Liu, Qi Chen, Shujun Wang, Xiaohan Xing, Jing Qin.
    [Paper]
  • Deform-Mamba Network for MRI Super-Resolution. [8 July, 2024] [ArXiv, 2024]
    Zexin Ji, Beiji Zou, Xiaoyan Kui, Pierre Vera, Su Ruan.
    [Paper]

2.5 Other Medical Tasks

  • MD-Dose: A Diffusion Model based on the Mamba for Radiotherapy Dose Prediction. [13 March, 2024] [ArXiv, 2024]
    Linjie Fu, Xia Li, Xiuding Cai, Yingkai Wang, Xueyao Wang, Yali Shen, Yu Yao.
    [Paper] [Code]
  • Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy. [20 April, 2024] [ArXiv, 2024]
    Yuelin Zhang, Wanquan Yan, Kim Yan, Chun Ping Lam, Yufu Qiu, Pengyu Zheng, Raymond Shing-Yan Tang, Shing Shin Cheng.
    [Paper] [Code]
  • VM-DDPM: Vision Mamba Diffusion for Medical Image Synthesis. [9 May, 2024] [ArXiv, 2024]
    Zhihan Ju, Wanting Zhou.
    [Paper]
  • I2I-Mamba: Multi-modal medical image synthesis via selective state space modeling. [22 May, 2024] [ArXiv, 2024]
    Omer F. Atli, Bilal Kabas, Fuat Arslan, Mahmut Yurt, Onat Dalmaz, Tolga Çukur.
    [Paper] [Code]
  • Soft Masked Mamba Diffusion Model for CT to MRI Conversion. [22 June, 2024] [ArXiv, 2024]
    Zhenbin Wang, Lei Zhang, Lituan Wang, Zhenwei Zhang.
    [Paper] [Code]
  • SR-Mamba: Effective Surgical Phase Recognition with State Space Model. [11 July, 2024] [ArXiv, 2024]
    Rui Cao, Jiangliu Wang, Yun-Hui Liu.
    [Paper] [Code]

Other Domains

Coming soon.
