ltj2013's starred repositories
AutoKernel
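AutoKernel is a simple, easy-to-use, low-barrier automatic operator optimization tool that improves the deployment efficiency of deep learning algorithms.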
Fractional-GPUs
Splits a single Nvidia GPU into multiple partitions with complete compute and memory isolation (with respect to performance) between the partitions
tvm-cuda-int8-benchmark
Benchmark of TVM quantized model on CUDA
TNN
TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop and server. TNN is distinguished by several outstanding features, including its cross-platform capability, high performance, model compression and code pruning. Based on ncnn and Rapidnet, TNN further strengthens the support and performance optimization for mobile devices, and also draws on the good extensibility and high performance of existing open source efforts. TNN has been deployed in multiple apps from Tencent, such as Mobile QQ, Weishi, Pitu, etc. Contributions are welcome: work collaboratively with us to make TNN a better framework.
zh-google-styleguide
Google open source project style guides (Chinese edition)
caffe-fixedpoint
Minimized Caffe that includes only the inference part and supports fixed-point computation
gpgpu-sim_distribution
GPGPU-Sim provides a detailed simulation model of a contemporary GPU (such as NVIDIA's Fermi and GT200 architectures) running CUDA and/or OpenCL workloads and now includes an integrated (and validated) energy model, GPUWattch.
gpgpu-sim_distribution
GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as well as a performance visualization tool, AerialVision, and an integrated energy model, GPUWattch.
CaffeModelCompression
Tool to compress trained Caffe weights