ZSL98's repositories

PAME

Early Exits of DNN Networks with TensorRT

Language: Python | License: MIT | 2 20

x-transformers

A simple but complete full-attention transformer with a set of promising experimental features from various papers

Language: Python | License: MIT | 100

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Language: Python | License: NOASSERTION | 100

ccf-deadlines

⏰ Collaboratively track deadlines of conferences recommended by CCF (website, Python CLI, WeChat applet). If you find it useful, please star this project, thanks~

Language: Vue | License: MIT | 000

CuAssembler

An unofficial CUDA assembler, for all generations of SASS, hopefully :)

Language: Python | License: MIT | 000

cuda_hook

Hooks CUDA-related dynamic libraries using automated code generation tools.

Language: C | License: MIT | 000

DeepLearningExamples

Deep Learning Examples

Language: Python | 000

FedML

A Research-oriented Federated Learning Library. Supporting distributed computing, mobile/IoT on-device training, and standalone simulation. A short version of our white paper has been accepted by NeurIPS 2020 workshop.

Language: Python | License: GPL-3.0 | 000

FedML-Server

FedML-Server: Federated Learning Server for FedML-IoT and FedML-Mobile

Language: Python | 000

gdev

First-Class GPU Resource Management: Device Drivers, Runtimes, and CUDA Compilers for Nouveau.

Language: C | License: MIT | 000

leaf

Leaf: A Benchmark for Federated Settings

Language: Python | License: BSD-2-Clause | 010

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | 000

nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates a high-performance executable from a DNN model description.

License: MIT | 000

nvsci

Linux kernel modules for secure sharing of memory buffers

Language: C | 000

orion

An interference-aware scheduler for fine-grained GPU sharing

Language: Python | License: MIT | 000

Shallow-Deep-Networks

Source Code for ICML 2019 Paper "Shallow-Deep Networks: Understanding and Mitigating Network Overthinking"

Language: Python | License: MIT | 000

Swin-Transformer

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".

Language: Python | License: MIT | 000

Syte2

Syte2 is a personal website with interactive social integrations.

Language: JavaScript | License: MIT | 000

TBsche

010

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | 000

ZSL98's repositories

PAME

x-transformers

xformers

benchmarks

bitwise_spgemm

ccf-deadlines

CuAssembler

cuda_hook

DeepLearningExamples

Dubhe-proof

FedML

FedML-Server

gdev

leaf

Megatron-LM

nnfusion

nvsci

orion

Shallow-Deep-Networks

Swin-Transformer

Syte2

TBsche

TensorRT-LLM

TGS

Train-Nvidia

tSparse

tvm

v4

website

zsl98.github.io