There are 30 repositories under the efficient-inference topic.
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Code for the paper "AdderNet: Do We Really Need Multiplications in Deep Learning?"
Learning Efficient Convolutional Networks through Network Slimming, ICCV 2017.
[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference
(CVPR 2021, Oral) Dynamic Slimmable Network
This is an official implementation for "DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation"
Implementation of AAAI 21 paper: Nested Named Entity Recognition with Partially Observed TreeCRFs
Jia-Hong Lee, Yi-Ming Chan, Ting-Yen Chen, and Chu-Song Chen, "Joint Estimation of Age and Gender from Unconstrained Face Images using Lightweight Multi-task CNN for Mobile Applications," IEEE International Conference on Multimedia Information Processing and Retrieval, MIPR 2018
[ECCV 2020] Code release for "Resolution Switchable Networks for Runtime Efficient Image Recognition"
Concise, Modular, Human-friendly PyTorch implementation of EfficientNet with Pre-trained Weights.
[ICLR 2020] "Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference"
Yi-Min Chou, Yi-Ming Chan, Jia-Hong Lee, Chih-Yi Chiu, Chu-Song Chen, "Unifying and Merging Well-trained Deep Neural Networks for Inference Stage," International Joint Conference on Artificial Intelligence (IJCAI), 2018
List of papers related to neural network quantization in recent AI conferences and journals (please refer to README in the folder for a better view).
Code for paper: 'Multi-Component Optimization and Efficient Deployment of Neural-Networks on Resource-Constrained IoT Hardware'
[FPGA'21] CoDeNet is an efficient object detection model on PyTorch, with SOTA performance on VOC and COCO based on CenterNet and Co-Designed deformable convolution.
Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks
Cheng-Hao Tu, Jia-Hong Lee, Yi-Ming Chan and Chu-Song Chen, "Pruning Depthwise Separable Convolutions for MobileNet Compression," International Joint Conference on Neural Networks, IJCNN 2020, July 2020.
Supplementary material for the IEEE Services Computing paper titled 'An SRAM Optimized Approach for Constant Memory Consumption and Ultra-fast Execution of ML Classifiers on TinyML Hardware'
Repository for the paper: TinyML Benchmark: Executing Fully Connected Neural Networks on Commodity Microcontrollers
Cheng-En Wu, Yi-Ming Chan, and Chu-Song Chen, "On Merging MobileNets for Efficient Multitask Inference," International Symposium on High-Performance Computer Architecture (HPCA) Workshop on Energy Efficient Machine Learning and Cognitive Computing for Embedded Applications (EMC2), 2019
Finding Storage- and Compute-Efficient Convolutional Neural Networks
Extremely light-weight MixNet with Top-1 75.7% and 2.5M params
NeurIPS 2019 MicroNet Challenge
Code for paper: 'Edge2Train: a framework to train machine learning models (SVMs) on resource-constrained IoT edge devices'
[MicroNet Challenge (NeurIPS 2019)] "Adjustable Quantization: Jointly Learn the Bit-width and Weight in DNN Training" by Yonggan Fu, Ruiyang Zhao, Yue Wang, Chaojian Li, Haoran You, Zhangyang Wang, Yingyan Lin
Repository for the ECML PKDD 2021 tutorial titled 'Machine Learning Meets Internet of Things: From Theory to Practice'
Exploring Variational Deep Q Networks. A study undertaken for the University of Cambridge's R244 Computer Science Masters Course. Inspired by https://arxiv.org/abs/1711.11225/.
Compute-efficient reinforcement learning with binary neural networks and evolution strategies.
A Post-Training Quantizer for the Design of Mixed Low-Precision DNNs with Dynamic Fixed-Point Representation for Efficient Hardware Acceleration on Edge Devices