yuuxiaooqingg's starred repositories
ChunkLlama
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
tensor_parallel
Automatically split your PyTorch models on multiple GPUs for training & inference
token_visualizer
Token-level visualization tools for large language models
st-moe-pytorch
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch
TransnormerLLM
Official implementation of TransNormerLLM: A Faster and Better LLM
mutransformers
Some common Hugging Face transformers in maximal update parametrization (µP)