There are 31 repositories under the self-attention topic.
"Hung-yi Lee's Deep Learning Tutorial" 《李宏毅深度学习教程》 (recommended by Prof. Hung-yi Lee 👍, nicknamed the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
The GitHub repository for the paper "Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting", accepted at AAAI 2021.
A comprehensive paper list on Vision Transformers/Attention, including papers, code, and related websites
PyTorch implementation of the Graph Attention Network model by Veličković et al. (2017, https://arxiv.org/abs/1710.10903)
My implementation of the original GAT paper (Veličković et al.). The additional playground.py file visualizes the Cora dataset, GAT embeddings, the attention mechanism, and entropy histograms. Both Cora (transductive) and PPI (inductive) examples are supported!
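For orientation, here is a minimal single-head GAT layer over a dense adjacency matrix. This is an illustrative sketch only; the repositories above implement the full multi-head, sparse versions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """Toy single-head GAT layer (dense adjacency, illustration only)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)        # shared node projection
        self.a = nn.Parameter(torch.randn(2 * out_dim) * 0.01)  # attention vector a

    def forward(self, h, adj):
        # h: (N, in_dim) node features; adj: (N, N) 0/1 adjacency with self-loops
        z = self.W(h)                 # (N, out_dim)
        d = z.size(1)
        # e_ij = LeakyReLU(a^T [z_i || z_j]), split as a_src . z_i + a_dst . z_j
        src = z @ self.a[:d]          # (N,) source-node term
        dst = z @ self.a[d:]          # (N,) neighbor term
        e = F.leaky_relu(src.unsqueeze(1) + dst.unsqueeze(0), negative_slope=0.2)
        e = e.masked_fill(adj == 0, float("-inf"))  # attend only over edges
        alpha = torch.softmax(e, dim=1)             # normalize over each node's neighbors
        return alpha @ z                            # (N, out_dim) attended features
```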
Datasets, tools, and benchmarks for representation learning of code.
[CVPR 2025] Official PyTorch Implementation of MambaVision: A Hybrid Mamba-Transformer Vision Backbone
CCNet: Criss-Cross Attention for Semantic Segmentation (TPAMI 2020 & ICCV 2019).
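The core idea of criss-cross attention: each pixel attends only to the pixels in its own row and column (H + W - 1 positions instead of H x W). A dense, unoptimized sketch of that mechanism follows; the official repo uses a custom CUDA kernel, recurs the operation twice, and masks the duplicated self position, none of which this toy version does.

```python
import torch

def criss_cross_attention(q, k, v):
    # q, k, v: (B, C, H, W). Each query position (h, w) attends over its row and column.
    B, C, H, W = q.shape
    row_e = torch.einsum("bchw,bchv->bhwv", q, k)  # row energies:    (B, H, W, W)
    col_e = torch.einsum("bchw,bcuw->bhwu", q, k)  # column energies: (B, H, W, H)
    attn = torch.softmax(torch.cat([row_e, col_e], dim=-1), dim=-1)  # (B, H, W, W+H)
    out = torch.einsum("bhwv,bchv->bchw", attn[..., :W], v) \
        + torch.einsum("bhwu,bcuw->bchw", attn[..., W:], v)
    return out  # (B, C, H, W) aggregated features
```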
A collection of recent Transformer-based computer vision and related works.
Implementations of various self-attention mechanisms focused on computer vision. An ongoing repository.
A list of efficient attention modules
"Pre-training of Deep Bidirectional Transformers for Language Understanding" (BERT) applied to pre-training TextCNN
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention
(ICLR 2022 Spotlight) Official PyTorch implementation of "How Do Vision Transformers Work?"
Text classification using deep learning models in PyTorch
A PyTorch implementation of Speech Transformer, an end-to-end ASR system built on the Transformer network, for Mandarin Chinese.
Universal Graph Transformer Self-Attention Networks (TheWebConf WWW 2022) (PyTorch and TensorFlow)
A PyTorch implementation of "Attention Is All You Need" and "Weighted Transformer Network for Machine Translation"
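The heart of "Attention Is All You Need" is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. A minimal sketch:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (..., seq_len, d_k); mask: broadcastable to (..., seq_q, seq_k), 0 = blocked
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```

Recent PyTorch versions ship an optimized built-in, torch.nn.functional.scaled_dot_product_attention, which is preferable in practice.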
A Structured Self-attentive Sentence Embedding
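That paper (Lin et al., 2017) computes r attention distributions over the LSTM hidden states, A = softmax(W_s2 tanh(W_s1 H^T)), and embeds the sentence as the matrix M = A H. A sketch, where d_a and r are the paper's tunable hyperparameters and the defaults below are illustrative:

```python
import torch
import torch.nn as nn

class StructuredSelfAttention(nn.Module):
    def __init__(self, hidden_dim, d_a=350, r=30):
        super().__init__()
        self.W_s1 = nn.Linear(hidden_dim, d_a, bias=False)
        self.W_s2 = nn.Linear(d_a, r, bias=False)

    def forward(self, H):
        # H: (batch, seq_len, hidden_dim) BiLSTM hidden states
        scores = self.W_s2(torch.tanh(self.W_s1(H)))       # (batch, seq_len, r)
        A = torch.softmax(scores, dim=1).transpose(1, 2)   # (batch, r, seq_len)
        return A @ H                                       # M: (batch, r, hidden_dim)
```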
The official PyTorch implementation of the paper "SAITS: Self-Attention-based Imputation for Time Series". A fast, state-of-the-art (SOTA) deep learning model for efficient time-series imputation (filling NaN missing values in multivariate, partially observed time series). https://arxiv.org/abs/2202.08516
PyTorch implementation of "Stand-Alone Self-Attention in Vision Models"
DSMIL: Dual-stream multiple instance learning networks for tumor detection in whole slide images
Keras, PyTorch, and NumPy Implementations of Deep Learning Architectures for NLP
Trainable fast and memory-efficient sparse attention
[ECCV 2024] Official PyTorch implementation of RoPE-ViT "Rotary Position Embedding for Vision Transformer"
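For reference, standard 1-D rotary position embedding rotates each consecutive feature pair of the queries/keys by a position-dependent angle; the ECCV paper adapts this to 2-D image coordinates. A sketch of the 1-D mechanism only:

```python
import torch

def apply_rope(x, base=10000.0):
    # x: (..., seq_len, dim) with even dim; rotates feature pairs (x[2i], x[2i+1])
    seq_len, dim = x.shape[-2], x.shape[-1]
    inv_freq = base ** (-torch.arange(0, dim, 2, dtype=x.dtype) / dim)  # (dim/2,)
    angles = torch.arange(seq_len, dtype=x.dtype)[:, None] * inv_freq   # (seq_len, dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```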
[NeurIPS 2021 Spotlight] & [IJCV 2024] SOFT: Softmax-free Transformer with Linear Complexity
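SOFT replaces the softmax with a Gaussian kernel to reach linear complexity. The generic kernel trick behind such linear attention, shown here with the elu(x) + 1 feature map of Katharopoulos et al. rather than SOFT's exact formulation, looks like:

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    # q, k, v: (..., N, d). Cost O(N d^2) instead of softmax attention's O(N^2 d).
    q = F.elu(q) + 1   # positive feature map phi(q)
    k = F.elu(k) + 1   # positive feature map phi(k)
    kv = k.transpose(-2, -1) @ v                           # (..., d, d), shared by all queries
    z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1)  # (..., N, 1) normalizer
    return (q @ kv) / (z + eps)
```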
The official repo for [TPAMI'25] "HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model"
Representation learning on dynamic graphs using self-attention networks
[MIR-2023-Survey] A continuously updated paper list of large multi-modal pre-trained models
Code for the paper "MASTER: Multi-Aspect Non-local Network for Scene Text Recognition" (Pattern Recognition 2021)
Awesome Transformers (self-attention) in Computer Vision