ptq

There are 0 repository under ptq topic.

Xilinx / brevitas
Brevitas: neural network quantization in PyTorch
quantization pytorch brevitas fpga neural-networks hardware-acceleration xilinx deep-learning ptq qat
Language:Python 1140
Bobo-y / flexible-yolov5
More readable and flexible yolov5 with more backbone(gcn, resnet, shufflenet, moblienet, efficientnet, hrnet, swin-transformer, etc) and (cbam，dcn and so on), and tensorrt
yolov5 resnet moblienet backbone neck cbam pytorch shufflenet hrnet dcnv2 tensorrt object-detection swin-transformer gcn yolov3 triton-server ptq qat sparsity
Language:Python 661
sony / model_optimization
Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. This project provides researchers, developers, and engineers advanced quantization and compression tools for deploying state-of-the-art neural networks.
deep-learning deep-neural-networks edge-ai machine-learning network-compression network-quantization neural-network optimizer ptq pytorch qat quantization tensorflow
Language:Python 295
ModelTC / llmc
This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
deployment llm pruning quantization tool benchmark evaluation large-language-models awq internlm2 llama3 smoothquant omniquant post-training-quantization ptq minicpm mixtral wanda quarot smollm
Language:Python 200
yester31 / TensorRT_API
Deep Learning Model Optimization Using by TensorRT API, window
cuda detr ptq pytorch quantization resnet tensorrt unet vgg yolov5 yolov6
Language:Python 17
MAGICS-LAB / OutEffHop
[ICML 2024] Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
softmax-1 attention-mechanism modern-hopfield-model modern-hopfield-networks outlier-removal transformer quantized-friendly hopfield-neural-network ptq icml-2024 outlier-treatment outliers attention no-op-outlier outlier
Language:Python 15
yester31 / Quantization_EX
quantization example for pqt & qat
int8 model-optimization post-training-quantization ptq pytorch-quantization qat quantization quantization-aware-training tensorrt
Language:Python 4
yester31 / TensorRT_ONNX
Generating tensorrt model using onnx
int8-inference int8-quantization onnx onnxruntime post-training-quantization ptq pytorch quantization tensorrt tensorrt-inference
Language:C++ 3
yester31 / TensorRT_Sparse
inference with the structured sparsity and quantization
accelerate-the-inference ptq quantization sparse-int8-model sparse-tensor-cores sparsity-pattern structured-sparsity tensorrt
Language:Python 3
BlindOver / blindover_AI
Build AI model to classify beverages for blind individuals
ai classification deep-learning mobile-app pytorch quantization efficientnet mobilenetv3 resnet shufflenetv2 ptq qat
Language:Python 2
smpanaro / norm-tweaking
Post post-training-quantization (PTQ) method for improving LLMs. Unofficial implementation of https://arxiv.org/abs/2309.02784
llms post-training-quantization quantization ptq
Language:Python 2
lix19937 / tensorrt-insight
deep insight tensorrt,nvidia
asp ptq qat tensorrt nvidia
Language:C++ 1
OmidGhadami95 / EfficientNetV2_Quantization_CK
EfficientNetV2 (Efficientnetv2-b2) and quantization int8 and fp32 (QAT and PTQ) on CK+ dataset . fine-tuning, augmentation, solving imbalanced dataset, etc.
efficientnet post-training-quantization ptq qat quantization quantization-aware-training efficientnetv2 ckplus efficientnetv2-b2 emotion-recognition facial-emotion-recognition imbalanced-dataset real-time-emotion-classification real-time-emotion-detection scale-down googlecolab keras python tensorflow
Language:Jupyter Notebook 1
amajji / LLM-Quantization-Techniques-Absmax-Zeropoint-GPTQ-GGUF
LLM quantization techniques: absmax, zero-point, GPTQ and GGUF
absolute ggml gguf gptq llamacpp llm ptq quantization quantization-aware-training zeropoint absmax
Language:Jupyter Notebook
ambideXtrous9 / Quantization-of-Models-PTQ-and-QAT
Quantization of Models : Post-Training Quantization(PTQ) and Quantize Aware Training(QAT)
ptq qat quantization quantization-aware-training keras pytorch pytorch-implementation tflite tflite-models
Language:Jupyter Notebook