Beast code in Giters

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Language: C++ | License: Apache-2.0 | 10356 157 3587

MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

Language: C++ | 8502 201 2517

cudf

cuDF - GPU DataFrame Library

Language: C++ | License: Apache-2.0 | 8089 149 6284

cutlass

CUDA Templates for Linear Algebra Subroutines

Language: C++ | License: NOASSERTION | 5004 108 951

mace

MACE is a deep learning inference framework optimized for mobile heterogeneous computing platforms.

Language: C++ | License: Apache-2.0 | 4900 230 677

MegEngine

MegEngine is a fast, scalable, easy-to-use deep learning framework with support for automatic differentiation.

Language: C++ | License: Apache-2.0 | 4742 137 370

HIP

HIP: C++ Heterogeneous-Compute Interface for Portability

Language: C++ | License: MIT | 3635 143 851

patchelf

A small utility to modify the dynamic linker and RPATH of ELF executables

Language: C | License: GPL-3.0 | 3402 78 260

jittor

Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.

Language: Python | License: Apache-2.0 | 3041 62 340

TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Language: Python | License: BSD-3-Clause | 2436 69 1446

asm

Learning assembly for Linux x86_64

Language: Assembly | License: NOASSERTION | 2164 97 10

tinyflow

Tutorial code on how to build your own Deep Learning System in 2k Lines

Language: C++ | License: Apache-2.0 | 2004 83 8

sequence_tagging

Named Entity Recognition (LSTM + CRF) - TensorFlow

Language: Python | License: Apache-2.0 | 1944 73 83

how-to-optimize-gemm

Language: C | 1684 45 17

TurboTransformers

A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT2, decoders, etc.) on CPU and GPU.

Language: C++ | License: NOASSERTION | 1457 41 118

nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates high-performance executables from a DNN model description.

Language: C++ | License: MIT | 939 44 205

maxas

Assembler for NVIDIA Maxwell architecture

Language: Sass | License: MIT | 935 88 11

runtime

A performant and modular runtime for TensorFlow

Language: C++ | License: Apache-2.0 | 750 51 72

d2l-tvm

Dive into Deep Learning Compiler

Language: Python | 630 39 6

CS143-Compilers-Stanford

My solutions to the programming assignments of the Stanford Compilers course.

Language: C++ | 340 30

gdev

First-Class GPU Resource Management: Device Drivers, Runtimes, and CUDA Compilers for Nouveau.

Language: C | License: MIT | 337 39 35

stack-machine

A simple stack-based virtual machine in C++ with a Forth-like programming language

Language: C++ | 164 14 1

rnn_benchmarks

RNN benchmarks of PyTorch, TensorFlow, and Theano

Language: Python | 87 9 3

python-cuda-profile

Language: Python | 32 2 1