Yudi Zhang's starred repositories
RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
llama-recipes
Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default and custom datasets for applications such as summarization and Q&A, along with a number of inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama 3 for WhatsApp & Messenger.
ml-engineering
Machine Learning Engineering Open Book
OpenAgents
[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild
MirrorSite
A collection of mirror sites.
the-art-of-debugging
The Art of Debugging
ChunkLlama
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
Transformer-M
[ICLR 2023] One Transformer Can Understand Both 2D & 3D Molecular Data (official implementation)
CapsFusion
[CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale
Spec-Bench
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
linear_open_lm
A repository for research on medium sized language models.