sirius93123's repositories
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
nncf
PyTorch*-based Neural Network Compression Framework for enhanced OpenVINO™ inference
XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
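The JAX entry above names three composable transformations: differentiation, vectorization, and JIT compilation. A minimal sketch of how they compose, assuming JAX is installed (function and variable names here are illustrative, not from the repo):

```python
# Sketch of JAX's composable transformations: grad, vmap, jit.
import jax
import jax.numpy as jnp

def loss(w, x):
    # A simple quadratic loss; any pure Python+NumPy-style function works.
    return jnp.sum((w * x - 1.0) ** 2)

# differentiate: gradient of the loss with respect to w
grad_loss = jax.grad(loss)

# vectorize: map the gradient over a batch of inputs x (w is shared)
batched_grad = jax.vmap(grad_loss, in_axes=(None, 0))

# JIT: compile the composed transformation with XLA
fast_batched_grad = jax.jit(batched_grad)

w = jnp.array(2.0)
xs = jnp.arange(1.0, 4.0)          # batch of inputs: [1.0, 2.0, 3.0]
print(fast_batched_grad(w, xs))    # per-example gradients: [2., 12., 30.]
```

The transformations nest in any order because each one maps a pure function to another pure function; that composability is the library's central design choice.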
pytorch_block_sparse
Fast Block Sparse Matrices for PyTorch
dlrm
An implementation of a deep learning recommendation model (DLRM)
open-earth-compiler
Development repository for the Open Earth Compiler
deepspeech.pytorch
Speech Recognition using DeepSpeech2.
AmpereSparseMatmul
A study of Ampere's sparse matmul
EasyTransfer
EasyTransfer is designed to make the development of transfer learning in NLP applications easier.
HAWQ
Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.
Model-Compression-Deploy
Model compression and deployment. Compression: 1) quantization: quantization-aware training at 16/8/4/2-bit (DoReFa; "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and ternary/binary (TWN/BNN/XNOR-Net), plus 8-bit post-training quantization (TensorRT); 2) pruning: normal, regular, and group-convolution channel pruning; 3) group convolution structure; 4) batch-normalization folding for quantization. Deployment: TensorRT with FP32/FP16/INT8 (PTQ calibration), op adaptation (upsample), and dynamic shapes.
nni
An open-source AutoML toolkit for automating the machine-learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
hummingbird
Hummingbird compiles trained ML models into tensor computation for faster inference.
vnpy
A Python-based open-source framework for developing quantitative trading platforms
DNN_NeuroSim_V2.1
Benchmark framework of compute-in-memory based accelerators for deep neural network (on-chip training chip focused)
FlexFlow
A distributed deep learning framework that supports flexible parallelization strategies.
pytorch-hessian-eigenthings
Efficient PyTorch Hessian eigendecomposition tools!
once-for-all
[ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment
FlexTensor
Automatic Schedule Exploration and Optimization Framework for Tensor Computations
inter-operator-scheduler
IOS: Inter-Operator Scheduler for CNN Acceleration