AinL's repositories
sharkshark-4k
Upscale a Twitch stream and restream it to Twitch, RTMP, or a file.
sea-attention
Official Implementation of SEA: Sparse Linear Attention with Estimated Attention Mask (ICLR 2024)
streaming-llm-triton
OpenAI Triton Implementation of Streaming LLM
cs454-project
CS454 2023 F Team 4
pypareto-native
Numba-optimized version of `pypareto`. Sorting chains for Pareto frontier extraction.
EXAONE-3.5
Official repository for EXAONE 3.5 built by LG AI Research
gmlwns2000.github.io
AcadHomepage: A Modern and Responsive Academic Personal Homepage
hip-attention
Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton.
InfiniteBench-hip
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
lm-evaluation-harness
A framework for few-shot evaluation of language models.
loft-hip
LOFT: A 1 Million+ Token Long-Context Benchmark
LongBench-hip
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
LongLM
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
sglang-hip12
SGLang is a fast serving framework for large language models and vision language models. See hip12-offload-add-offload-cache
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
triton-fix-autotune
Development repository for the Triton language and compiler
vllm-timber
A high-throughput and memory-efficient inference and serving engine for LLMs