Zhenyu (Allen) Zhang's starred repositories
stochastorch
A PyTorch implementation of stochastic addition.
LLM-Adapters
Code for our EMNLP 2023 Paper: "LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models"
infini-transformer-pytorch
Implementation of Infini-Transformer in PyTorch
searchformer
Official codebase for the paper "Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping".
flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
Usage-of-the-8bit-Quantization-in-Neural-Network-Training
This repo contains the script to reproduce the experiments in the project 'Usage of the 8bit Quantization in Neural Network Training'.
improved-t5
Experiments toward training a new and improved T5
CUDA-Learn-Notes
🎉 Modern CUDA Learn Notes with PyTorch: fp32, fp16, bf16, fp8/int8, flash_attn, sgemm, sgemv, warp/block reduce, dot, elementwise, softmax, layernorm, rmsnorm.
mistral-inference
Official inference library for Mistral models
recurrentgemma
Open weights language model from Google DeepMind, based on Griffin.
schedule_free
Schedule-Free Optimization in PyTorch