Luchang Li's repositories
export_llama_to_onnx
Export LLaMA to ONNX
onnxsim_large_model
Simplify large (>2GB) ONNX models
BFGS-Optimization-for-curve-fitting
Use the BFGS optimization algorithm to solve problems such as curve fitting
android_ndk_examples
android_ndk_examples
CppTemplateTutorial
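A Chinese-language tutorial on C++ templates. Unlike the well-known book C++ Templates, this series teaches C++ templates as a Turing-complete language, aiming to give readers a thorough command of metaprogramming. (Work in progress)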
decoupleQ
A quantization algorithm for LLM
DeepLearningExamples
Deep Learning Examples
EAGLE
EAGLE: Lossless Acceleration of LLM Decoding by Feature Extrapolation
EasyNLP
EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit
flash-attention-minimal
Flash Attention in ~100 lines of CUDA (forward pass only)
gemmlowp
Low-precision matrix multiplication
HashingDeepLearning
Codebase for "SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems"
kernel_tuner
Kernel Tuner
llm-awq
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
modelbox
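A unified, high-performance, easy-to-use programming framework for AI application developers, for quickly building AI industry applications that span device, edge, and cloud on top of full-stack AI services.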
MVision
Robot vision, mobile robots, VS-SLAM, ORB-SLAM2, deep learning object detection, yolov3, behavior detection, opencv, PCL, machine learning, autonomous driving
OpenCL-examples-1
Simple OpenCL examples for exploiting GPU computing
pytorch-cifar
95.47% on CIFAR10 with PyTorch
speculative-decoding
Explorations into some recent techniques surrounding speculative decoding
tensorflow-1
An Open Source Machine Learning Framework for Everyone
UGATIT-pytorch
Official PyTorch implementation of U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation
weight_only_quant_rot
Weight-only quantization with rotation
xla
Enabling PyTorch on Google TPU
YHs_Sample
Yinghan's Code Sample