SSMs and related works list

About

A list of state space models (SSMs) and related works.

List of SSMs

Number	SSM	Paper	Code	Conference or Journal	URL
1	HiPPO	HiPPO: Recurrent Memory with Optimal Polynomial Projections	https://github.com/state-spaces/s4	NeurIPS 2020	https://proceedings.neurips.cc/paper/2020/hash/102f0bb6efb3a6128a3c750dd16729be-Abstract.html
2	LSSL	Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers	https://github.com/state-spaces/s4	NeurIPS 2021	https://openreview.net/forum?id=yWd42CWN3c
3	S4	Efficiently Modeling Long Sequences with Structured State Spaces	https://github.com/state-spaces/s4	ICLR 2022	https://openreview.net/forum?id=uYLFoz1vlAC
4	DSS	Diagonal State Spaces are as Effective as Structured State Spaces	https://github.com/ag1988/dss	NeurIPS 2022	https://openreview.net/forum?id=RjS0j6tsSrf
5	S4D	On the Parameterization and Initialization of Diagonal State Space Models	https://github.com/state-spaces/s4	NeurIPS 2022	https://openreview.net/forum?id=yJE7iQSAep
6	Generalized HiPPO	How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections	https://github.com/state-spaces/s4	ICLR 2023	https://openreview.net/forum?id=klK17OQ3KB
7	GSS	Long Range Language Modeling via Gated State Spaces		ICLR 2023	https://openreview.net/forum?id=5MkYIYCbva
8	Liquid S4	Liquid Structural State-Space Models	https://github.com/raminmh/liquid-s4	ICLR 2023	https://openreview.net/forum?id=g4OTKRKfS7R
9	S5	Simplified State Space Layers for Sequence Modeling	https://github.com/lindermanlab/S5	ICLR 2023	https://openreview.net/forum?id=Ai8Hw3AXqks
10	H3	Hungry Hungry Hippos: Towards Language Modeling with State Space Models	https://github.com/HazyResearch/H3	ICLR 2023	https://openreview.net/forum?id=COZDy0WYGg
11	S4-PTD and S5-PTD	Robustifying State-space Models for Long Sequences via Approximate Diagonalization		ICLR 2024	https://openreview.net/forum?id=DjeQ39QoLQ
12	S6	Mamba: Linear-Time Sequence Modeling with Selective State Spaces	https://github.com/state-spaces/mamba		https://arxiv.org/abs/2312.00752
13	STU	Spectral State Space Models	https://github.com/catid/spectral_ssm		https://arxiv.org/abs/2312.06837
14	Mamba 2	Transformers are SSMs: Generalized Models and Efficient Algorithms with Structured State Space Duality		ICML 2024

List of Linear RNNs (LRNNs)

Number	LRNN	Paper	Code	Conference or Journal	URL
1	CKConv	CKConv: Continuous Kernel Convolution For Sequential Data	https://github.com/dwromero/ckconv	ICLR 2021	https://openreview.net/forum?id=8FhxBtXSl0
2	FlexConv	FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes	https://github.com/rjbruin/flexconv	ICLR 2022	https://openreview.net/forum?id=3jooF27-0Wy
3	DLR	Simplifying and Understanding State Space Models with Diagonal Linear RNNs	https://github.com/ag1988/dlr		https://arxiv.org/abs/2212.00768
4	CCNN	Modelling Long Range Dependencies in $N$D: From Task-Specific to a General Purpose CNN	https://github.com/david-knigge/ccnn	ICLR 2023	https://openreview.net/forum?id=ZW5aK4yCRqU
5	SGConv	What Makes Convolutional Models Great on Long Sequence Modeling?	https://github.com/ctlllll/SGConv	ICLR 2023	https://openreview.net/forum?id=TGJSPbRpJX-
6	Mega	Mega: Moving Average Equipped Gated Attention	https://github.com/facebookresearch/mega	ICLR 2023	https://openreview.net/forum?id=qNLe3iq2El
7	TNN	Toeplitz Neural Network for Sequence Modeling	https://github.com/Doraemonzzz/tnn-pytorch	ICLR 2023	https://openreview.net/forum?id=IxmWsm4xrua
8	Hyena	Hyena Hierarchy: Towards Larger Convolutional Language Models	https://github.com/hazyresearch/safari	ICML 2023	https://proceedings.mlr.press/v202/poli23a.html
9	MultiresNet	Sequence Modeling with Multiresolution Convolutional Memory	https://github.com/thjashin/multires-conv	ICML 2023	https://proceedings.mlr.press/v202/shi23f.html
10	LRU	Resurrecting Recurrent Neural Networks for Long Sequences		ICML 2023	https://proceedings.mlr.press/v202/orvieto23a.html
11	RWKV v4 (Dove)	RWKV: Reinventing RNNs for the Transformer Era	https://github.com/BlinkDL/RWKV-LM	Findings of EMNLP 2023	https://aclanthology.org/2023.findings-emnlp.936/
12	RetNet	Retentive Network: A Successor to Transformer for Large Language Models	https://github.com/microsoft/torchscale		https://arxiv.org/abs/2307.08621
13	MultiHyena	Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions		NeurIPS 2023	https://openreview.net/forum?id=OWELckerm6
14	Monarch Mixer	Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture	https://github.com/HazyResearch/m2	NeurIPS 2023	https://openreview.net/forum?id=cB0BImqSS9
15	SeqBoat	Sparse Modular Activation for Efficient Sequence Modeling	https://github.com/renll/SeqBoat	NeurIPS 2023	https://openreview.net/forum?id=TfbzX6I14i
16	HGRN	Hierarchically Gated Recurrent Neural Network for Sequence Modeling	https://github.com/OpenNLPLab/HGRN	NeurIPS 2023	https://openreview.net/forum?id=P1TCHxJwLB
17	GLA Transformer	Gated Linear Attention Transformers with Hardware-Efficient Training	https://github.com/sustcsonglin/flash-linear-attention		https://arxiv.org/abs/2312.06635
18	Orchid	Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling			https://arxiv.org/abs/2402.18508
19	RWKV v5 (Eagle) and v6 (Finch)	Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence	https://huggingface.co/RWKV		https://arxiv.org/abs/2404.05892
20	HGRN2	HGRN2: Gated Linear RNNs with State Expansion	https://github.com/OpenNLPLab/HGRN2		https://arxiv.org/abs/2404.07904

List of Surveys

Number	Paper	Conference or Journal	URL
1	A Unified View of Long-Sequence Models towards Modeling Million-Scale Dependencies		https://arxiv.org/abs/2302.06218
2	State Space Model for New-Generation Network Alternative to Transformers: A Survey		https://arxiv.org/abs/2404.09516