Yinghai Lu's repositories
AITemplateOSS
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Language: Python, License: Apache-2.0
cutlass
CUDA Templates for Linear Algebra Subroutines
Language: C++, License: NOASSERTION
onnx-fb-universe
ONNX Integration Builds
onnx-tensorrt
ONNX-TensorRT: TensorRT backend for ONNX
TensorComprehensions
A domain-specific language to express machine learning workloads.
tensorflow
Computation using data flow graphs for scalable machine learning
Torch-TensorRT
PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT
Language: Jupyter Notebook, License: BSD-3-Clause
xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
Language: Python, License: NOASSERTION