kmcgrie's repositories
lc0
SYCL work
intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offers SOTA compression techniques for LLMs; runs LLMs efficiently on Intel platforms ⚡
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
triton
Development repository for the Triton language and compiler
TensorRT-LLM
TensorRT-LLM provides an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines containing state-of-the-art optimizations for efficient inference on NVIDIA GPUs. It also includes components for creating Python and C++ runtimes that execute those TensorRT engines.
llvm
Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
llama.cpp
Port of Facebook's LLaMA model in C/C++