dzy's repositories
micronet
micronet, a model compression and deployment library. Compression: 1. quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa, "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b)/ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); 2. pruning: normal, regular, and group-convolution channel pruning; 3. group convolution structure; 4. batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op adaptation (upsample), and dynamic shape.
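All of the QAT and PTQ schemes named above build on uniform affine quantization of tensors to a low-bit integer grid. A minimal pure-Python sketch of 8-bit quantize/dequantize (illustrative only, not micronet's actual API):

```python
# Minimal sketch of uniform affine (asymmetric) quantization, the basic
# building block behind the QAT/PTQ schemes listed above.
# Illustrative pure Python, not micronet's actual API.

def quant_params(x_min, x_max, n_bits=8):
    """Derive scale and zero-point mapping [x_min, x_max] to [0, 2^n - 1]."""
    qmax = (1 << n_bits) - 1
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)  # range must include 0
    scale = (x_max - x_min) / qmax
    zero_point = round(-x_min / scale)
    return scale, zero_point

def quantize(x, scale, zero_point, n_bits=8):
    qmax = (1 << n_bits) - 1
    q = round(x / scale) + zero_point
    return max(0, min(qmax, q))  # clamp to the integer range

def dequantize(q, scale, zero_point):
    return scale * (q - zero_point)

scale, zp = quant_params(-1.0, 1.0)
q = quantize(0.5, scale, zp)
x_hat = dequantize(q, scale, zp)  # recovers 0.5 to within one quantization step
```

The round-trip error is bounded by the scale (one quantization step), which is why PTQ calibration focuses on choosing a good [x_min, x_max] range per tensor.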
Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
Torch-Pruning
[CVPR-2023] Towards Any Structural Pruning; LLMs / Diffusion / YOLOv8 / CNNs / Transformers
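Structural (channel) pruning of the kind Torch-Pruning implements typically ranks whole filters by an importance score, such as the L1 norm of their weights, and removes the lowest-scoring ones. A toy pure-Python sketch of that ranking step (hypothetical helper names, not the library's API):

```python
# Toy sketch of L1-norm channel ranking for structural pruning: score each
# output channel (filter) by the sum of absolute weights, then keep the
# highest-scoring channels. Hypothetical helpers, not Torch-Pruning's API.

def l1_channel_scores(filters):
    """filters: list of filters, each given as a flat list of weights."""
    return [sum(abs(w) for w in f) for f in filters]

def select_kept_channels(filters, prune_ratio):
    """Return sorted indices of the channels that survive pruning."""
    scores = l1_channel_scores(filters)
    n_keep = max(1, round(len(filters) * (1 - prune_ratio)))
    ranked = sorted(range(len(filters)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])

filters = [[0.9, -0.8], [0.01, 0.02], [0.5, 0.4], [-0.03, 0.0]]
kept = select_kept_channels(filters, prune_ratio=0.5)  # keeps channels 0 and 2
```

The hard part that Torch-Pruning automates is not this scoring but propagating the removed channels through dependent layers (the following conv's input channels, the matching BatchNorm, residual connections), which is what "any structural pruning" refers to.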
torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
aimet
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
BRECQ
PyTorch implementation of BRECQ (ICLR 2021)
CenterNet
Object detection, 3D detection, and pose estimation using center point detection:
Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
examples
A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.
how-to-learn-deep-learning-framework
how to learn PyTorch and OneFlow
IntraQ
PyTorch implementation of our CVPR 2022 paper, IntraQ: Learning Synthetic Images with Intra-Class Heterogeneity for Zero-Shot Network Quantization
KuiperInfer
Implement a high-performance deep learning inference library from scratch, step by step, supporting inference for models such as UNet, YOLOv5, and ResNet.
mindspore
MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
MQBench
Model Quantization Benchmark
nndeploy
nndeploy is a cross-platform, high-performance, and easy-to-use end-to-end AI model deployment framework. It aims to shield the differences between inference frameworks, deliver a consistent and user-friendly programming experience in complex deployment environments, and focus on performance throughout the deployment pipeline.
onnx-modifier
A tool to modify ONNX models visually, based on Netron and Flask.
Python-100-Days
Python - 100 Days from Novice to Master
python-patterns
A collection of design patterns/idioms in Python
pytorch-OpCounter
Count the MACs / FLOPs of your PyTorch model.
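Counters like pytorch-OpCounter derive these numbers analytically per layer; for a 2D convolution the MAC count is Cout x (Cin/groups) x Kh x Kw x Hout x Wout. A small pure-Python sketch of that formula (illustrative, not the library's implementation):

```python
# Analytic MAC count for a 2D convolution, the kind of per-layer formula
# a FLOP counter such as pytorch-OpCounter applies under the hood.
# Illustrative pure Python, not the library's implementation.

def conv2d_out_size(in_size, kernel, stride=1, padding=0):
    """Standard conv output size: floor((H + 2p - k) / s) + 1."""
    return (in_size + 2 * padding - kernel) // stride + 1

def conv2d_macs(c_in, c_out, kernel, h_in, w_in, stride=1, padding=0, groups=1):
    h_out = conv2d_out_size(h_in, kernel, stride, padding)
    w_out = conv2d_out_size(w_in, kernel, stride, padding)
    # each output element accumulates (c_in / groups) * k * k multiply-adds
    return c_out * (c_in // groups) * kernel * kernel * h_out * w_out

# First conv of a ResNet-style stem: 3 -> 64 channels, 7x7, stride 2, pad 3
macs = conv2d_macs(3, 64, 7, 224, 224, stride=2, padding=3)  # ~118M MACs
```

Multiply by 2 if you report FLOPs (one multiply plus one add per MAC); tools differ on this convention, which is a common source of mismatched numbers between papers.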
pytorch-tutorial
PyTorch Tutorial for Deep Learning Researchers
Quantformer
This is the official PyTorch implementation of the paper *Quantformer: Learning Extremely Low-precision Vision Transformers*.
TinyNeuralNetwork
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
torchdistill
A coding-free framework built on PyTorch for reproducible deep learning studies. 🏆 20 knowledge distillation methods presented at CVPR, ICLR, ECCV, NeurIPS, ICCV, etc. are implemented so far. 🎁 Trained models, training logs, and configurations are available to ensure reproducibility and benchmarking.
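Most of the distillation methods such a framework reproduces descend from Hinton et al.'s soft-target loss: KL divergence between temperature-softened teacher and student distributions, scaled by T^2. A minimal stdlib-only sketch of that loss (illustrative, not torchdistill's API):

```python
import math

# Minimal sketch of the soft-target knowledge-distillation loss:
# KL divergence between temperature-softened teacher and student
# distributions, scaled by T^2 as in Hinton et al.
# Stdlib-only and illustrative, not torchdistill's actual API.

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student's softened prediction
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl  # T^2 keeps gradient magnitudes comparable

loss_same = distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1])  # 0: match
loss_diff = distillation_loss([0.1, 1.0, 2.0], [2.0, 1.0, 0.1])  # > 0
```

In training this term is usually combined with the ordinary cross-entropy on hard labels, weighted by a mixing coefficient.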
tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
vision
Datasets, Transforms and Models specific to Computer Vision