There are 29 repositories under the long-context topic.
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
Code and documentation for LongLoRA and LongAlpaca (ICLR 2024 Oral)
[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Large Context Attention
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch
Implementation of the Recurrent Memory Transformer (NeurIPS 2022) in Pytorch
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143)
✨✨Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
Awesome LLM Plaza: daily tracking of all sorts of awesome LLM topics, e.g. LLMs for coding, robotics, reasoning, multimodality, etc.
Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality"
ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models
LongQLoRA: Extend the Context Length of LLMs Efficiently
Implementation of NAACL 2024 Outstanding Paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"
The official repo for "LLoCo: Learning Long Contexts Offline"
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of existing MLLMs to comprehend long multimodal documents.
Implementation of Infini-Transformer in Pytorch
[EMNLP 2024] LongRAG: A Dual-perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering
Implementation of Perceiver AR, Deepmind's new long-context attention network based on Perceiver architecture, in Pytorch
Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges
WritingBench: A Comprehensive Benchmark for Generative Writing
The official implementation of "Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks"
My own attempt at a long context genomics model, leveraging recent advances in long context attention modeling (Flash Attention + other hierarchical methods)
This is the official repo of "QuickLLaMA: Query-aware Inference Acceleration for Large Language Models"
A survey of long-context LLMs from four perspectives: architecture, infrastructure, training, and evaluation
"Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, Beidi Chen, Xiaoxia Wu, Zhangyang Wang.