LeiWang1999

Antares: an automatic engine for multi-platform kernel generation and optimization. Supports CPU, CUDA, ROCm, DirectX12, GraphCore, SYCL for CPU/GPU, OpenCL for AMD/NVIDIA, and Android CPU/GPU backends.

Language: Python · License: NOASSERTION · 010

apex

A PyTorch extension: tools for easy mixed-precision and distributed training in PyTorch

Language: Python · License: BSD-3-Clause · 010

AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Language: Python · License: MIT · 010

AutoGPTQ_nf

Language: Python · License: MIT · 010

ComputeShaderPlayground

Compute Shader Playground with DirectX12

Language: C++ · 020

gptq_faster

Faster 3-bit CUDA kernel for GPTQ.

Language: Python · License: Apache-2.0 · 010

LeiWang1999

010

mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

Language: Python · License: Apache-2.0 · 010

nmsparse

Language: HTML · 010

nni

An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.

Language: Python · License: MIT · 010

ppq

PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.

Language: Python · License: Apache-2.0 · 010

relax

Language: Python · License: Apache-2.0 · 010

ucas-covid19

UCAS daily epidemic-prevention report-filing assistant

Language: Python · 010

Welder_artifacts

OSDI 2023 Welder artifacts

000

Lei Wang's repositories

FPGA

ZYNQ-NVDLA

AICS-Course

tvm_gpu_gemm

AutoGPTQ.tvm

VehicleFlowDetection

nvdla-parser

HPC-Course

leiblog.wang

rocblas-benchmark

LeiBlog

cv

tvm

compiler-and-arch

_cutlass

nnfusion

antares

apex

AutoGPTQ

AutoGPTQ_nf

ComputeShaderPlayground

gptq_faster

LeiWang1999

mlc-llm

nmsparse

nni

ppq

relax

ucas-covid19

Welder_artifacts