Yuan (zhouyuan)

Company: @Intel-bigdata

Yuan's starred repositories

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Language: Python | License: MIT | Stargazers: 8402 | Issues: 79 | Issues: 31

vanna

🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.

Language: Python | License: MIT | Stargazers: 7466 | Issues: 47 | Issues: 217

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 6946 | Issues: 83 | Issues: 1368

DeepSeek-Coder

DeepSeek Coder: Let the Code Write Itself

Language: Python | License: MIT | Stargazers: 5648 | Issues: 63 | Issues: 140

LLMLingua

Speeds up LLM inference and enhances the LLM's perception of key information by compressing the prompt and KV cache, achieving up to 20x compression with minimal performance loss.

Language: Python | License: MIT | Stargazers: 4005 | Issues: 32 | Issues: 97

Anima

33B Chinese LLM, DPO QLoRA, 100K context, AirLLM 70B inference with a single 4GB GPU

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 3347 | Issues: 97 | Issues: 129

sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.

Language: Python | License: Apache-2.0 | Stargazers: 2503 | Issues: 30 | Issues: 228

blitzar

Zero-knowledge proof acceleration with GPUs for C++ and Rust

Language: C++ | License: Apache-2.0 | Stargazers: 2274 | Issues: 44 | Issues: 2

DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Language: Python | License: Apache-2.0 | Stargazers: 1692 | Issues: 40 | Issues: 273

executorch

On-device AI across mobile, embedded and edge for PyTorch

Language: C++ | License: NOASSERTION | Stargazers: 1334 | Issues: 52 | Issues: 249

Lumos

A RAG LLM co-pilot for browsing the web, powered by local LLMs

Language: TypeScript | License: MIT | Stargazers: 1275 | Issues: 9 | Issues: 93

splink

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends

Language: Python | License: MIT | Stargazers: 1115 | Issues: 17 | Issues: 630

sneller

World's fastest log analysis: λ + SQL + JSON + S3

Language: Go | License: NOASSERTION | Stargazers: 974 | Issues: 22 | Issues: 7

gluten

Gluten: Plugin to Double SparkSQL's Performance

Language: Scala | License: Apache-2.0 | Stargazers: 920 | Issues: 31 | Issues: 1303

yet-another-applied-llm-benchmark

A benchmark to evaluate language models on questions I've previously asked them to solve.

Language: Python | License: GPL-3.0 | Stargazers: 782 | Issues: 15 | Issues: 7

incubator-xtable

Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

Language: Java | License: Apache-2.0 | Stargazers: 702 | Issues: 26 | Issues: 194

IvorySQL

Open Source Oracle Compatible PostgreSQL.

Language: C | License: Apache-2.0 | Stargazers: 699 | Issues: 32 | Issues: 341

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda | License: Apache-2.0 | Stargazers: 697 | Issues: 13 | Issues: 50

RingAttention

Transformers with Arbitrarily Large Context

Language: Python | License: Apache-2.0 | Stargazers: 544 | Issues: 5 | Issues: 13

dbchaos

Stress-test your database with pre-defined queries. Generate synthetic data and events statically or with GPT.

Language: Go | License: MIT | Stargazers: 430 | Issues: 4 | Issues: 6

sql-eval

Evaluate the accuracy of LLM generated outputs

Language: Python | License: Apache-2.0 | Stargazers: 415 | Issues: 7 | Issues: 14

rtp-llm

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

Language: C++ | License: Apache-2.0 | Stargazers: 400 | Issues: 11 | Issues: 55

sort-research-rs

Test and benchmark suite for sort implementations.

Language: Rust | License: Apache-2.0 | Stargazers: 291 | Issues: 8 | Issues: 13

mamba.c

Inference of Mamba models in pure C

libCacheSim

A high-performance cache simulator and library

Language: C | License: GPL-3.0 | Stargazers: 89 | Issues: 4 | Issues: 6

avx_qsort

Quick sort code using AVX2 instructions

Language: Assembly | Stargazers: 67 | Issues: 6 | Issues: 0

Gluten-Trino

Gluten: Plugin to Boost Trino's Performance

Language: Java | License: Apache-2.0 | Stargazers: 66 | Issues: 8 | Issues: 26

crystal

GPU library for writing SQL queries

Language: C | License: MIT | Stargazers: 49 | Issues: 4 | Issues: 8

storage-testbench

A testbench for Google Cloud Storage client libraries.

Language: Python | License: Apache-2.0 | Stargazers: 8 | Issues: 38 | Issues: 97