Repositories under the quantized-neural-networks topic:
A toolkit to optimize ML models (Keras/TensorFlow) for deployment, including quantization and pruning.
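The pruning half of such a toolkit boils down to zeroing out the smallest-magnitude weights. A minimal NumPy sketch of magnitude pruning (an illustration of the idea, not the TFMOT API; `magnitude_prune` is a hypothetical helper):

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of weights."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    # k-th smallest absolute value becomes the pruning threshold
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= thresh, 0.0, w)

w = np.array([0.05, -0.8, 0.3, -0.02])
pruned = magnitude_prune(w, 0.5)  # zeroes the two smallest: 0.05 and -0.02
```

Real toolkits apply this gradually during fine-tuning rather than in one shot, so the network can recover accuracy as sparsity increases.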
Generates a quantization parameter file for int8 inference with the ncnn framework.
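The core of such a calibration tool is computing a per-tensor scale that maps float activations onto the int8 range. A minimal sketch using simple min-max calibration (production tools, including ncnn's, typically use more careful calibration such as KL-divergence matching; `int8_scale` and `quantize_int8` are hypothetical names):

```python
import numpy as np

def int8_scale(x: np.ndarray) -> float:
    """Per-tensor scale mapping float values into [-127, 127]."""
    max_abs = float(np.max(np.abs(x)))
    return max_abs / 127.0 if max_abs > 0 else 1.0

def quantize_int8(x: np.ndarray, scale: float) -> np.ndarray:
    """Round to the int8 grid defined by `scale` and saturate."""
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

x = np.array([0.5, -1.27, 0.03])
s = int8_scale(x)        # ~0.01
q = quantize_int8(x, s)  # [50, -127, 3]
```

The parameter file then only needs to store one such scale per layer (or per channel); inference multiplies the int8 results back by the scale.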
[CVPR'20] ZeroQ: A Novel Zero Shot Quantization Framework
QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX
The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment
Mobilenet v1 trained on Imagenet for STM32 using extended CMSIS-NN with INT-Q quantization support
Implementations of some recent quantization techniques in PyTorch.
Slides with modifications for a course at Tsinghua University.
This repository contains source code to binarize any real-value word embeddings into binary vectors.
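The simplest way to binarize real-valued embeddings is to threshold each dimension. A minimal sketch using a per-dimension mean threshold (the repository's actual method may be learned, e.g. autoencoder-based; `binarize_embeddings` is a hypothetical name):

```python
import numpy as np

def binarize_embeddings(emb: np.ndarray) -> np.ndarray:
    """Map each dimension to a bit: 1 if above that dimension's mean, else 0."""
    thresholds = emb.mean(axis=0)  # one threshold per embedding dimension
    return (emb > thresholds).astype(np.uint8)

emb = np.array([[0.2, -0.5],
                [0.8,  0.5],
                [-0.1, 0.3]])
bits = binarize_embeddings(emb)  # [[0, 0], [1, 1], [0, 1]]
```

Binary vectors shrink storage by roughly 32x and make similarity search a cheap Hamming-distance computation.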
This repository contains the PyTorch scripts to train mixed-precision networks for microcontroller deployment, based on the memory constraints of the target device.
Binary neural networks developed by Huawei Noah's Ark Lab
Low-precision (quantized) YOLOv5.
Contains code for Binary, Ternary, N-bit Quantized and Hybrid CNNs for low precision experiments.
Code implementation of our AISTATS'21 paper "Mirror Descent View for Neural Network Quantization"
Mobilenet v1 (3,160,160, alpha=0.25, and 3,192,192, alpha=0.5) on STM32H7 using X-CUBE-AI v4.1.0
Efficient Neural Architecture Search coupled with quantized CNNs to search for resource-efficient and accurate architectures.
Modeling stuck-at faults for RRAM inference on popular neural networks after quantization
Code implementation of our AAAI'22 paper "Improved Gradient-Based Adversarial Attacks for Quantized Networks"
Quantized training using Keras
Exercises on HW acceleration of quantized neural networks for the course Integrated Systems Architecture at PoliTo
A Python-based utility to convert a grayscale image into Verilog code.
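A utility like this essentially emits a Verilog ROM whose contents are the pixel values. A minimal sketch generating a case-statement ROM for 8-bit grayscale pixels (the repository's actual output format may differ; `image_to_verilog_rom` is a hypothetical name):

```python
def image_to_verilog_rom(pixels, name="image_rom"):
    """Emit a Verilog case-statement ROM holding 8-bit grayscale pixels."""
    addr_bits = max(1, (len(pixels) - 1).bit_length())
    lines = [
        f"module {name}(input [{addr_bits - 1}:0] addr, output reg [7:0] data);",
        "  always @(*) case (addr)",
    ]
    for i, p in enumerate(pixels):
        lines.append(f"    {addr_bits}'d{i}: data = 8'd{p};")
    lines.append("    default: data = 8'd0;")
    lines.append("  endcase")
    lines.append("endmodule")
    return "\n".join(lines)

print(image_to_verilog_rom([0, 128, 255]))
```

For real images one would flatten the pixel array row-major and read it back with a row/column-to-address computation in hardware.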
An implementation of the VQ-VAE algorithm on the MNIST and CIFAR-10 datasets, using both MSE and NLL losses.
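The defining step of VQ-VAE is snapping each encoder output to its nearest codebook vector. A minimal NumPy sketch of that quantization step alone (encoder, decoder, and loss terms omitted; `vector_quantize` is a hypothetical name):

```python
import numpy as np

def vector_quantize(z: np.ndarray, codebook: np.ndarray):
    """Assign each row of z to its nearest codebook entry (squared L2)."""
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
z = np.array([[0.1, -0.2], [0.9, 1.2]])
zq, idx = vector_quantize(z, codebook)  # idx = [0, 1]
```

In training, the non-differentiable argmin is bypassed with a straight-through estimator, and the codebook is pulled toward the encoder outputs by a commitment loss.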
Training neural nets with quantized weights at arbitrarily specified bit-depths.
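Arbitrary bit-depth training usually means fake quantization: weights are snapped to a uniform k-bit grid in the forward pass while gradients flow through unchanged (a straight-through estimator). A minimal sketch of the forward-pass quantizer (one plausible scheme, not necessarily this repository's; `fake_quantize` is a hypothetical name):

```python
import numpy as np

def fake_quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric uniform quantization of weights to `bits` bits (bits >= 2),
    returned in float so it can sit inside a training loop."""
    levels = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits, 1 for 2 bits
    max_abs = np.max(np.abs(w))
    scale = max_abs / levels if max_abs > 0 else 1.0
    return np.clip(np.round(w / scale), -levels, levels) * scale

w = np.array([0.9, -0.45, 0.1])
w4 = fake_quantize(w, 4)  # snapped to a 15-level grid, scale = 0.9 / 7
```

At 2 bits this collapses the weights to at most three values (ternary), which is why very low bit-depths typically need retraining to recover accuracy.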
Artifact for SC21: APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores.
Checks the effect of quantization on ResNet architectures.
[WIP] PyTorch bindings for cublasLt with an example of quantized i8f16 MLP