There are 13 repositories under the ptq topic.
A more readable and flexible YOLOv5 with additional backbones (GCN, ResNet, ShuffleNet, MobileNet, EfficientNet, HRNet, Swin Transformer, etc.) and modules (CBAM, DCN, and so on), plus TensorRT support.
Model Compression Toolkit (MCT) is an open-source project for optimizing neural network models for efficient, constrained hardware. It provides researchers, developers, and engineers with advanced quantization and compression tools for deploying state-of-the-art neural networks.
Deep learning model optimization using the TensorRT API on Windows.
[ICML 2024] Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
Quantization examples for PTQ and QAT.
Generating a TensorRT model from ONNX.
Inference with structured sparsity and quantization.
Building an AI model to classify beverages for blind individuals.
A post-training quantization (PTQ) method for improving LLMs. Unofficial implementation of https://arxiv.org/abs/2309.02784
EfficientNetV2 (EfficientNetV2-B2) with INT8 and FP32 quantization (QAT and PTQ) on the CK+ dataset: fine-tuning, augmentation, handling the imbalanced dataset, etc.
LLM quantization techniques: absmax, zero-point, GPTQ, and GGUF.
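The two simplest techniques in that list, absmax and zero-point quantization, can be sketched in a few lines of NumPy. This is a minimal illustration (function names and the toy weight vector are my own, not from any of the repositories above): absmax maps values symmetrically by the largest magnitude, while zero-point shifts the range asymmetrically so the full [min, max] interval is used.

```python
import numpy as np

def absmax_quantize(x, bits=8):
    # Symmetric scheme: scale by the maximum absolute value so values
    # map onto the signed range [-(2^(b-1)-1), 2^(b-1)-1].
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    q = np.round(x / scale).astype(np.int8)
    return q, scale

def zeropoint_quantize(x, bits=8):
    # Asymmetric scheme: a zero-point offset shifts the full [min, max]
    # range of x onto [-(2^(b-1)), 2^(b-1)-1].
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = np.round(qmin - x.min() / scale)
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax).astype(np.int8)
    return q, scale, zero_point

# Toy weights for illustration only.
weights = np.array([0.5, -1.2, 0.03, 2.0, -0.7], dtype=np.float32)

q_abs, s_abs = absmax_quantize(weights)
deq_abs = q_abs.astype(np.float32) * s_abs

q_zp, s_zp, zp = zeropoint_quantize(weights)
deq_zp = (q_zp.astype(np.float32) - zp) * s_zp
```

Dequantizing recovers each value to within one quantization step; zero-point quantization tends to use the integer range more efficiently when the weight distribution is skewed.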
Quantization of models: Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT).
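The distinction between PTQ and QAT can be shown with a small NumPy sketch (all names and data here are illustrative, not taken from the repository): PTQ quantizes already-trained weights once, while QAT inserts a quantize-dequantize ("fake quant") step into the forward pass during training, with the straight-through estimator treating rounding as the identity in the backward pass so the weights learn to absorb the quantization error.

```python
import numpy as np

def fake_quant(x, bits=8):
    # Quantize-dequantize round trip: the rounding error introduced here
    # is what PTQ suffers after training and what QAT adapts to during it.
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    return np.round(x / scale) * scale

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 4)).astype(np.float32)
true_w = rng.standard_normal((4, 1)).astype(np.float32)
y = x @ true_w

# PTQ: quantize trained weights once, with no retraining.
ptq_w = fake_quant(true_w)

# QAT: tiny training loop where the forward pass sees quantized
# weights; the straight-through estimator applies the gradient
# computed at the quantized point directly to the float weights.
w = 0.1 * rng.standard_normal((4, 1)).astype(np.float32)
lr = 0.1
for _ in range(200):
    wq = fake_quant(w)               # forward uses quantized weights
    err = x @ wq - y
    grad = x.T @ err / len(x)        # STE: round() treated as identity
    w -= lr * grad

qat_mse = float(np.mean((x @ fake_quant(w) - y) ** 2))
```

At 8 bits both approaches land close to the float model; the gap between PTQ and QAT typically widens at lower bit widths, which is why QAT is preferred for aggressive quantization.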