Meng, Hengyu's starred repositories
intel-npu-acceleration-library
Intel® NPU Acceleration Library
neural-speed
An innovative library for efficient LLM inference via low-bit quantization
how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms ⚡
neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
x86-64-minimal-JIT-compiler-Cpp
Writing a minimal x86-64 JIT compiler in C++
optimum-intel
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
intel-extension-for-tensorflow
Intel® Extension for TensorFlow*
mlir-hello
MLIR Sample dialect
awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
oneAPI-samples
Samples for Intel® oneAPI Toolkits
easy-just-in-time
An LLVM optimization that extracts a function, embeds its intermediate representation in the binary, and executes it using the LLVM just-in-time compiler.
onnx2pytorch
Transform an ONNX model into a PyTorch representation
GEMM_Optimization
Optimize GEMM: an 800x improvement using AVX512 and AVX512-BF16.
dpcpp-tutorial
Intel Data Parallel C++ (and SYCL 2020) Tutorial.
ipex_verbose
IPEX (Intel® Extension for PyTorch) verbose toolkit
PySparseConvNet
A Python framework for sparse neural networks
MinkowskiEngine
Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors