cekcoco's repositories
llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
ppq
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
GPTQ-for-LLaMa
4-bit quantization of LLaMA using GPTQ
mmdeploy
OpenMMLab Model Deployment Framework
triton
Development repository for the Triton language and compiler
I-BERT
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
TPAT
TensorRT Plugin Autogen Tool
tpu-mlir
Machine learning compiler based on MLIR for Sophgo TPU.
Tengine
Tengine is a lightweight, high-performance, modular inference engine for embedded devices
tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
MQBench
Model Quantization Benchmark
onnx
Open standard for machine learning interoperability
Deformable-DETR
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
NumCpp
C++ implementation of the Python Numpy library
leetcode-master
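《代码随想录》 LeetCode practice guide: a recommended order for 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, more than 50 mind maps, and versions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer confusing! 🔥🔥 Take a look and you will wish you had found it sooner! 🚀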
chineseocr_lite
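Ultra-lightweight Chinese OCR with support for vertical text recognition and for ncnn, MNN, and TNN inference (dbnet (1.8M) + crnn (2.5M) + anglenet (378KB)); the total model size is only 4.7M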
insightface
State-of-the-art 2D and 3D Face Analysis Project
nanodet
⚡Super fast and lightweight anchor-free object detection model. 🔥Only 980 KB (int8) / 1.8 MB (fp16), and runs at 97 FPS on a cellphone🔥
EasyQuant
EasyQuant (EQ) is an efficient and simple post-training quantization method that works by optimizing the scales of weights and activations.
BRECQ
PyTorch implementation of BRECQ, ICLR 2021
PytorchToCaffe
Converts PyTorch models to Caffe models. Supports PyTorch 0.3, 0.3.1, 0.4, 0.4.1, 1.0, 1.0.1, 1.2, and 1.3. Note that only PyTorch 1.1 has some bugs.
mlir
"Multi-Level Intermediate Representation" Compiler Infrastructure
Learn-Statistical-Learning-Method
Implementation of the algorithms in Statistical Learning Method (《统计学习方法》), Second Edition.
cnn-quantization
Quantization of convolutional neural networks.
pacnet
Pixel-Adaptive Convolutional Neural Networks (CVPR '19)
wincnn
Winograd minimal convolution algorithm generator for convolutional neural networks.