haichaozhang's repositories
alpa_zhc
Automatic parallelization for large-scale neural networks
apex
A PyTorch extension: tools for easy mixed-precision and distributed training in PyTorch
Cream
This is a collection of our NAS and Vision Transformer work.
cub
Cooperative primitives for CUDA C++.
FasterTransformer
Transformer-related optimizations, including BERT and GPT
glslang
Khronos-reference front end for GLSL/ESSL, partial front end for HLSL, and a SPIR-V generator.
Hetu
A high-performance distributed deep learning system targeting large-scale and automated distributed training.
mmdeploy
OpenMMLab Model Deployment Framework
multibuild
Machinery for building and testing Python wheels for Linux, macOS, and (less flexibly) Windows.
nccl
Optimized primitives for collective multi-GPU communication
ncnn
ncnn is a high-performance neural network inference framework optimized for mobile platforms
onnxruntime
ONNX Runtime: cross-platform, high-performance ML inference and training accelerator
opencv
Open Source Computer Vision Library
opencv-python
Automated CI toolchain to produce precompiled opencv-python, opencv-python-headless, opencv-contrib-python and opencv-contrib-python-headless packages.
opencv_contrib
Repository for OpenCV's extra modules
opencv_extra
OpenCV extra data
optimum-quanto
A PyTorch quantization backend for optimum
ppl.cv
ppl.cv is a high-performance image-processing library from OpenPPL that supports various platforms.
ppl.nn
A primitive library for neural networks
pybind11
Seamless operability between C++11 and Python
qt5
Qt5 super module
qtbase
Qt Base (Core, Gui, Widgets, Network, ...)
spdlog
Fast C++ logging library.
Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
TensorRT
TensorRT is a C++ library for high-performance inference on NVIDIA GPUs and deep learning accelerators.
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
traitlets
A lightweight Traits-like module
tvm
Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators