Zexuan Zhong's starred repositories
flash-attention
Fast and memory-efficient exact attention
DeepSeek-Coder
DeepSeek Coder: Let the Code Write Itself
annotated_latex_equations
Examples of how to create colorful, annotated equations in LaTeX using TikZ.
state-spaces
Sequence Modeling with Structured State Spaces
performer-pytorch
An implementation of Performer, a linear attention-based transformer, in PyTorch
RETRO-pytorch
Implementation of RETRO, DeepMind's Retrieval-based Attention net, in PyTorch
contriever
Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning
annotated-s4
Implementation of https://srush.github.io/annotated-s4
nn_pruning
Prune a model while finetuning or training.
world-models
Extracting spatial and temporal world models from LLMs
CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
OptiPrompt
[NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240
EntityQuestions
[EMNLP 2021] Simple Entity-centric Questions Challenge Dense Retrievers https://arxiv.org/abs/2109.08535
DinkyTrain
Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃