There are 32 repositories under the quantization-aware-training topic.
micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b: DoReFa; "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b: ternary and binary, TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, FP32/FP16/INT8 (PTQ calibration), op adaptation (upsample), dynamic shape.
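The PTQ path above reduces to affine int8 quantization: derive a scale and zero-point from the observed tensor range, then round and clamp. A minimal sketch in plain Python (function names are illustrative, not micronet's API):

```python
def affine_qparams(xmin, xmax, qmin=-128, qmax=127):
    """Derive scale and zero-point for asymmetric int8 quantization."""
    xmin, xmax = min(xmin, 0.0), max(xmax, 0.0)  # range must include 0
    scale = (xmax - xmin) / (qmax - qmin)
    zero_point = round(qmin - xmin / scale)
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))  # clamp onto the int8 grid

def dequantize(q, scale, zero_point):
    return scale * (q - zero_point)

scale, zp = affine_qparams(-1.0, 1.0)   # e.g. a calibrated activation range
q = quantize(0.5, scale, zp)
x_hat = dequantize(q, scale, zp)        # recovers 0.5 up to rounding error
```

In a real PTQ flow the (xmin, xmax) range comes from running calibration data through the model, which is what the listed TensorRT int8 calibration step does.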
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Neural Network Compression Framework for enhanced OpenVINO™ inference
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
YOLO model compression and multi-dataset training.
A model compression and acceleration toolbox based on PyTorch.
Tutorial notebooks for hls4ml
0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch with Llama(2) Architecture
An automated toolset for analyzing and modifying the structure of PyTorch models, including a model-compression algorithm library built on automatic model-structure analysis.
This repository contains notebooks that show the usage of TensorFlow Lite for quantizing deep neural networks.
OpenVINO Training Extensions Object Detection
Notes on quantization in neural networks
Quantization Aware Training
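Quantization-aware training works by inserting "fake quantize" ops into the forward pass: each tensor is quantized and immediately dequantized, so the network trains against the rounding error it will see at inference (gradients flow through the round via the straight-through estimator). A minimal symmetric per-tensor sketch, with illustrative names and no framework dependencies:

```python
def fake_quantize(x, num_bits=8):
    """Quantize-dequantize x so training sees values snapped to the int grid."""
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for int8
    scale = max(abs(v) for v in x) / qmax or 1.0  # per-tensor symmetric scale
    out = []
    for v in x:
        q = max(-qmax, min(qmax, round(v / scale)))  # round + clamp
        out.append(q * scale)                        # dequantize in place
    return out

w = [0.31, -0.74, 0.05, 1.27]
w_q = fake_quantize(w)   # same shape and dtype, values on the int8 grid
```

Frameworks such as PyTorch (`torch.ao.quantization`) and the TensorFlow Model Optimization Toolkit implement this same quantize-dequantize pattern with learned or observed scales per tensor or per channel.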
Quantization-aware training with spiking neural networks
3rd place solution for NeurIPS 2019 MicroNet challenge
FakeQuantize with Learned Step Size (LSQ+) as Observer in PyTorch.
Code for paper 'Multi-Component Optimization and Efficient Deployment of Neural-Networks on Resource-Constrained IoT Hardware'
A tutorial of model quantization using TensorFlow
Code for the ISCAS23 paper "The Hardware Impact of Quantization and Pruning for Weights in Spiking Neural Networks"
BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks, ECCV 2022
Autoencoder model for FPGA implementation using hls4ml. Repository for Applied Electronics Project.
An example to quantize MobileNetV2 trained on CIFAR-10 dataset with PyTorch FX graph mode quantization
All methods of PyTorch quantization, demonstrated on ResNet-50.
Model quantization with PyTorch, TensorFlow & Larq.
Submission name: QualcommAI-EfficientNet. MicroNet Challenge (NeurIPS 2019) submission - Qualcomm AI Research
Quantization examples for PTQ & QAT.
Multi-Domain Balanced Sampling Improves Out-of-Distribution Generalization of Chest X-ray Pathology Prediction Models
Disentangling joint continuous and discrete representations for anomaly detection in high-energy physics.
One Bit at a Time: Impact of Quantisation on Neural Machine Translation
Low-Precision Neural Networks for Classification on PYNQ with FINN
Comprehensive study on the quantization of various CNN models, employing techniques such as Post-Training Quantization and Quantization Aware Training (QAT).
Quantization Aware Training