Repositories under the long-context topic:
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch
Transformers with Arbitrarily Large Context
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch
Implementation of Recurrent Memory Transformer, NeurIPS 2022 paper, in Pytorch
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143)
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
LongQLoRA: Extend Context Length of LLMs Efficiently
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
LooGLE: Long Context Evaluation for Long-Context Language Models
Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality"
The official repo for "LLoCo: Learning Long Contexts Offline"
Implementation of Infini-Transformer in Pytorch
Implementation of Perceiver AR, Deepmind's new long-context attention network based on Perceiver architecture, in Pytorch
Implementation of the paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
awesome llm plaza: daily tracking of all sorts of awesome LLM topics, e.g. LLMs for coding, robotics, reasoning, multimodality, etc.
My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other hierarchical methods)
The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"
[DEPRECATED] Very fast, large music transformer with 8k sequence length, efficient heptabit MIDI notes encoding, true full MIDI instruments range, chords counters and outro tokens
"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang.
Papers of Long Context Language Model
Streamlined variant of Long-Range Arena with pinned dependencies, automated data downloads, and deterministic shuffling.
needle in a haystack for LLMs
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Finetuning and evaluating LLMs to extract GHG emissions from PDF reports using RAG and grammar-based decoding.