Jianyu Huang's repositories
CS378_PfCandP
CS378 Programming for Correctness and Performance
effectivepython
Effective Python: Second Edition — Source Code and Errata for the Book
flash-attention
Fast and memory-efficient exact attention
friendLunarBirthday
Generates CSV data of my friends' Chinese lunar birthdays for import into Google Calendar
neon
Nervana's Python-based deep learning framework
param
PArametrized Recommendation and Ai Model benchmark is a repository for the development of numerous uBenchmarks as well as end-to-end nets for the evaluation of training and inference platforms.
torchrec-1
PyTorch domain library for recommendation systems
torchrec-3
PyTorch domain library for recommendation systems
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.
xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.