Honggyu Kim's starred repositories
open-gpu-kernel-modules
NVIDIA Linux open GPU kernel module source
onnxruntime
ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator
llama-cpp-python
Python bindings for llama.cpp
ipex-llm
Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, DeepSpeed, vLLM, FastChat, Axolotl, etc.
FasterTransformer
Transformer-related optimizations, including BERT and GPT
kernel-development
Presentation on how the Linux kernel is developed
llama3.cuda
llama3.cuda is a pure C/CUDA implementation of the Llama 3 model.
optimum-benchmark
A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers and Sentence-Transformers with full support of Optimum's hardware optimizations & quantization schemes.
precise-leak-sanitizer
A dynamic memory leak detector that can pinpoint where memory is lost, using an LLVM pass
grace-kernel
Upstream kernel with pending Grace patches for partners. Patches include bug fixes made during Grace production while they await upstreaming.