Yixiao Yuan's starred repositories
llama3-from-scratch
llama3 implementation one matrix multiplication at a time
Awesome-LLM-Inference
📖 A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
awesome-llm-role-playing-with-persona
Awesome-llm-role-playing-with-persona: a curated list of resources for large language models for role-playing with assigned personas
llama3-Chinese-chat
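Llama3 Chinese repository (aggregated resources: interesting weights from fine-tuned and modified versions by community members and vendors, plus tutorial videos and documentation on training, inference, evaluation, and deployment)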
SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
wordcab-transcribe
💬 ASR FastAPI server using faster-whisper and Multi-Scale Auto-Tuning Spectral Clustering for diarization.
WhisperLive
A nearly-live implementation of OpenAI's Whisper.
SearchEngine
Search engine principles