zeroine's repositories

aimet

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.

Language: Python · License: NOASSERTION

flash-attention

Fast and memory-efficient exact attention

Language: Python · License: BSD-3-Clause

MQBench

Model Quantization Benchmark

Language: Shell · License: Apache-2.0

ppq

PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.

Language: Python · License: Apache-2.0

TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference. It includes an inference optimizer and a runtime that deliver low latency and high throughput for inference applications.

Language: C++ · License: Apache-2.0

zeroine

cutlass-cute-sample

aimet

flash-attention

MQBench

ppq

silkflow

TensorRT