Beast code in Giters

The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers

Language: Nim | License: Apache-2.0 | 26600

TNN

TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop, and server. TNN is distinguished by several outstanding features, including its cross-platform capability, high performance, model compression, and code pruning. Based on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, and also draws on the good extensibility and high performance of existing open source efforts. TNN has been deployed in multiple apps from Tencent, such as Mobile QQ, Weishi, Pitu, etc. Contributions are welcome: collaborate with us and make TNN a better framework.

Language: C++ | License: NOASSERTION | 434400

ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

Language: C++ | License: NOASSERTION | 1984300

turingas

Assembler for NVIDIA Volta and Turing GPUs

Language: Python | License: MIT | 19000

12306

Smart ticket grabbing and booking for 12306

Language: Python | License: MIT | 3373100

triton

Development repository for the Triton language and compiler

Language: C++ | License: MIT | 1208400

TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Language: C++ | License: Apache-2.0 | 1034200

zh-google-styleguide

Google open source project style guides (Chinese edition)

Language: Makefile | 1048500

maxas

Assembler for NVIDIA Maxwell architecture

Language: Sass | License: MIT | 93500

caffe-fixedpoint

Minimized Caffe that includes only the inference part and supports fixed-point computation

Language: C++ | 600

myconfig

Language: VimL | 600

asfermi

Assembler for NVIDIA Fermi. Imported from Google Code

Language: C++ | 6600

gpgpu-sim_distribution

GPGPU-Sim provides a detailed simulation model of a contemporary GPU (such as NVIDIA's Fermi and GT200 architectures) running CUDA and/or OpenCL workloads and now includes an integrated (and validated) energy model, GPUWattch.

Language: C++ | License: NOASSERTION | 100

tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Language: Python | License: Apache-2.0 | 1145600

gpgpu-sim_distribution

GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as well as a performance visualization tool, AerialVision, and an integrated energy model, GPUWattch.

Language: C++ | License: NOASSERTION | 104300

CaffeModelCompression

Tool to compress trained caffe weights

Language: C | 10600

caffe

Caffe for Sparse and Low-rank Deep Neural Networks

Language: C++ | License: NOASSERTION | 37400