Lei Wang's repositories
ZYNQ-NVDLA
NVDLA (an open-source DL accelerator framework) implementation on FPGA.
tvm_gpu_gemm
Playing with GEMM in TVM
AutoGPTQ.tvm
GPTQ inference TVM kernel
VehicleFlowDetection
Implementation of vehicle flow statistics based on TensorFlow and YOLOv3, with a PyQt5 GUI.
leiblog.wang
My new blog, powered by Hexo: http://leiblog.wang
cutlass_fpA_intB_gemm
A standalone GEMM kernel for FP16 activations and quantized weights, extracted from FasterTransformer
vllm-bitblas
A high-throughput and memory-efficient inference and serving engine for LLMs
gptq_faster
Faster 3-bit CUDA kernel for GPTQ.
Welder_artifacts
OSDI 2023 Welder artifacts