leizhao1234's repositories
cute-gemm
Language: C++
cutlass
CUDA Templates for Linear Algebra Subroutines
Language: C++ | License: NOASSERTION
FasterTransformer
Transformer related optimization, including BERT, GPT
Language: C++ | License: Apache-2.0
flash-attention
Fast and memory-efficient exact attention
Language: Python | License: BSD-3-Clause
Megatron-LM
Ongoing research training transformer models at scale
Language: Python | License: NOASSERTION
SwissArmyTransformer
SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.
Language: Python | License: Apache-2.0
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
Language: Python | License: Apache-2.0