Yahya Yang's repositories
yyh-kaggle
kaggle
flash-attention
Fast and memory-efficient exact attention
BSD-3-Clause
CTranslate2
Fast inference engine for Transformer models
MIT
FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache-2.0
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Apache-2.0
tuning_playbook_zh_cn
A tactical playbook that systematically teaches you how to maximize the performance of deep learning models.
NOASSERTION
math_pyside2
math_pyside2