Repositories under the attention-mechanisms topic:
Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
A comprehensive paper list on Vision Transformers/Attention, including papers, code, and related websites
Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch
Implementation of Toolformer, Language Models That Can Use Tools, by Meta AI
Implementation of Make-A-Video, new SOTA text to video generator from Meta AI, in Pytorch
Implementation of AlphaFold 3 from Google DeepMind in Pytorch
Awesome List of Attention Modules and Plug&Play Modules in Computer Vision
Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in Pytorch
Implementation of MeshGPT, SOTA Mesh generation using Attention, in Pytorch
Implementation of Phenaki Video, which uses Mask GIT to produce text guided videos of up to 2 minutes in length, in Pytorch
Implementation of plug-and-play Attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
PyTorch Dual-Attention LSTM-Autoencoder For Multivariate Time Series
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch
Implementation of MagViT2 Tokenizer in Pytorch
Implementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs
Unofficial implementation of iTransformer - SOTA Time Series Forecasting using Attention networks, out of Tsinghua / Ant Group
🦖Pytorch implementation of popular Attention Mechanisms, Vision Transformers, MLP-Like models and CNNs.🔥🔥🔥
An implementation of local windowed attention for language modeling (a minimal illustrative sketch follows this list)
Implementation of a single layer of the MMDiT, proposed in Stable Diffusion 3, in Pytorch
Implementation of RT1 (Robotic Transformer) in Pytorch
Official PyTorch Implementation for "Rotate to Attend: Convolutional Triplet Attention Module." [WACV 2021]
Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in Pytorch
Implementation of Q-Transformer, Scalable Offline Reinforcement Learning via Autoregressive Q-Functions, out of Google DeepMind
Explorations into training LLMs to use clinical calculators from patient history, using open-source models. Will start with Wells' Criteria
Implementation of the Equiformer, SE3/E3 equivariant attention network that reaches new SOTA, and adopted for use by EquiFold for protein folding
Neat (Neural Attention) Vision is a visualization tool for the attention mechanisms of deep-learning models for Natural Language Processing (NLP) tasks. (framework-agnostic)
Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch
Sparse and structured neural attention mechanisms
Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT
Implementation of Block Recurrent Transformer - Pytorch
Learning the YOLOv3 code from scratch
Implementation of Flash Attention in Jax
Implementation of fused cosine similarity attention in the same style as Flash Attention (a minimal illustrative sketch follows this list)
Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling
Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in Pytorch
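To make the local windowed attention entry above concrete, here is a minimal, illustrative PyTorch sketch of the idea, not the repo's API: each query attends only to the preceding `window_size` positions, so the useful context per query is bounded by the window rather than the full sequence. The function name, shapes, and default window size are assumptions made for the example; a real implementation buckets the sequence to avoid materializing the full attention matrix.

```python
# Minimal sketch of causal local windowed attention (illustrative, not the repo's API).
import torch

def local_windowed_attention(q, k, v, window_size=64):
    # q, k, v: (batch, heads, seq_len, dim_head)
    seq_len, dim_head = q.shape[-2], q.shape[-1]
    scores = (q @ k.transpose(-2, -1)) / dim_head ** 0.5       # (b, h, n, n)

    # Query i may attend only to keys j with i - window_size < j <= i
    # (causal and local); everything else is masked to -inf.
    i = torch.arange(seq_len, device=q.device).unsqueeze(-1)   # query indices, (n, 1)
    j = torch.arange(seq_len, device=q.device).unsqueeze(0)    # key indices,   (1, n)
    allowed = (j <= i) & (j > i - window_size)
    scores = scores.masked_fill(~allowed, float('-inf'))

    return scores.softmax(dim=-1) @ v                          # (b, h, n, dim_head)

# example usage
q = k = v = torch.randn(1, 8, 256, 64)
out = local_windowed_attention(q, k, v, window_size=32)        # (1, 8, 256, 64)
```

Note that this naive version still builds the full n-by-n score matrix for clarity; the point of the windowing is that only O(n * window_size) of those entries carry information.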
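Similarly, for the cosine similarity attention entry, a minimal unfused sketch under stated assumptions (the actual repo fuses this into a single kernel, and the scale value here is an arbitrary choice for illustration): queries and keys are l2-normalized so their dot product is a cosine similarity in [-1, 1], then multiplied by a fixed scale before the softmax, which keeps the attention logits numerically bounded.

```python
# Minimal sketch of (unfused) cosine similarity attention; names and scale are assumptions.
import torch
import torch.nn.functional as F

def cosine_sim_attention(q, k, v, scale=10.0, causal=False):
    # q, k, v: (batch, heads, seq_len, dim_head)
    q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)      # unit-norm queries and keys
    scores = (q @ k.transpose(-2, -1)) * scale                 # bounded in [-scale, scale]
    if causal:
        n = scores.shape[-1]
        causal_mask = torch.ones(n, n, device=q.device).triu(1).bool()
        scores = scores.masked_fill(causal_mask, float('-inf'))
    return scores.softmax(dim=-1) @ v
```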