Zhubo Shi's starred repositories
the-art-of-command-line
Master the command line, in one page
llm-action
This project shares technical principles and hands-on experience with large language models.
speculative-decoding
Explorations into some recent techniques surrounding speculative decoding
MediaCrawler
Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, Weibo posts and comments, Baidu Tieba posts and comment replies, and Zhihu Q&A articles and comments.
LLMSpeculativeSampling
Fast inference from large language models via speculative decoding
Awesome-LLMs-on-device
Awesome LLMs on Device: A Comprehensive Survey
text-generation-inference
Large Language Model Text Generation Inference
BigLittleDecoder
[NeurIPS'23] Speculative Decoding with Big Little Decoder
LookaheadDecoding
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Spec-Bench
Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)
SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
prompt-cache
Modular and structured prompt caching for low-latency LLM inference
Awesome-LLM-Inference
📖A curated list of Awesome LLM Inference Paper with codes, TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention etc.
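Several of the repositories above (speculative-decoding, LLMSpeculativeSampling, Spec-Bench, SpeculativeDecodingPapers) center on speculative decoding. As a quick orientation, here is a minimal sketch of one core idea, speculative sampling: a cheap draft model proposes several tokens, and the target model accepts each with probability min(1, p/q), resampling from the renormalized residual max(0, p - q) on rejection. The toy distributions and function names below are hypothetical stand-ins, not code from any of the listed repos.

```python
import random

random.seed(0)

VOCAB = ["a", "b", "c"]

# Hypothetical toy next-token distributions standing in for real models.
def draft_probs(prefix):
    return {"a": 0.6, "b": 0.3, "c": 0.1}

def target_probs(prefix):
    return {"a": 0.5, "b": 0.4, "c": 0.1}

def sample(probs):
    r, acc = random.random(), 0.0
    for tok, p in probs.items():
        acc += p
        if r < acc:
            return tok
    return tok  # guard against floating-point rounding

def speculative_step(prefix, gamma=4):
    """Draft gamma tokens cheaply, then accept/reject them against the target model."""
    drafted, ctx = [], list(prefix)
    for _ in range(gamma):
        q = draft_probs(ctx)
        tok = sample(q)
        drafted.append((tok, q))
        ctx.append(tok)

    out = list(prefix)
    for tok, q in drafted:
        p = target_probs(out)
        # Accept the drafted token with probability min(1, p(tok)/q(tok)).
        if random.random() < min(1.0, p[tok] / q[tok]):
            out.append(tok)
        else:
            # On rejection, resample from the residual max(0, p - q), renormalized.
            residual = {t: max(0.0, p[t] - q[t]) for t in VOCAB}
            z = sum(residual.values())
            if z > 0:
                out.append(sample({t: v / z for t, v in residual.items()}))
            break
    else:
        # All gamma drafts accepted: sample one bonus token from the target model.
        out.append(sample(target_probs(out)))
    return out

print(speculative_step(["<s>"]))
```

This acceptance/residual scheme preserves the target model's output distribution exactly while letting one target-model pass verify up to gamma drafted tokens; the surveyed repos differ mainly in how the drafts are produced (a small model, lookahead n-grams, etc.).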