Michael Mi (guocuimi)

Company: Vectorch

Location: Bellevue, WA

Michael Mi's starred repositories

chatbot-ui

AI chat for every model.

Language: TypeScript | License: MIT | Stargazers: 27256 | Issues: 242 | Issues: 934

triton

Development repository for the Triton language and compiler

mamba

Mamba SSM architecture

Language: Python | License: Apache-2.0 | Stargazers: 11449 | Issues: 98 | Issues: 386

ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python | License: Apache-2.0 | Stargazers: 10661 | Issues: 67 | Issues: 664

Yi

A series of large language models trained from scratch by developers @01-ai

Language: Python | License: Apache-2.0 | Stargazers: 7428 | Issues: 112 | Issues: 287

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda | License: Apache-2.0 | Stargazers: 752 | Issues: 13 | Issues: 62

Jinja2Cpp

Jinja2 C++ (and for C++) almost full-conformance template engine implementation

Language: C++ | License: MPL-2.0 | Stargazers: 478 | Issues: 17 | Issues: 133

nvbench

CUDA Kernel Benchmarking Library

Language: Cuda | License: Apache-2.0 | Stargazers: 439 | Issues: 18 | Issues: 89

ByteTransformer

Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052

Language: C++ | License: Apache-2.0 | Stargazers: 434 | Issues: 10 | Issues: 10

hal-9100

Edge full-stack LLM platform, written in Rust.

Language: Rust | License: MIT | Stargazers: 361 | Issues: 11 | Issues: 79

calm

CUDA/Metal accelerated language model inference

Language: C | License: MIT | Stargazers: 335 | Issues: 9 | Issues: 0

pyglove

Manipulating Python Programs

Language: Python | License: Apache-2.0 | Stargazers: 320 | Issues: 6 | Issues: 25

ScaleLLM

A high-performance inference system for large language models, designed for production environments.

Language: C++ | License: Apache-2.0 | Stargazers: 315 | Issues: 15 | Issues: 65

run-clang-format

A wrapper script around clang-format, suitable for linting multiple files and for use in continuous integration

Language: Python | License: MIT | Stargazers: 235 | Issues: 7 | Issues: 22

cuda_hgemm

Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores with the WMMA API and MMA PTX instructions.

Language: Cuda | License: MIT | Stargazers: 228 | Issues: 4 | Issues: 11

MatmulTutorial

An Easy-to-understand TensorOp Matmul Tutorial

Language: C++ | License: Apache-2.0 | Stargazers: 212 | Issues: 8 | Issues: 9

sirius

A Plonkish folding framework for Incrementally Verifiable Computation (IVC).

Language: Rust | License: MIT | Stargazers: 105 | Issues: 5 | Issues: 81

langfun

Empower LLMs with Symbols.

Language: Python | License: Apache-2.0 | Stargazers: 84 | Issues: 5 | Issues: 1

SA-Segment-Anything

Vision-oriented multimodal AI

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 46 | Issues: 4 | Issues: 0

LLMBench

A library for validating and benchmarking LLM inference.

Language: Python | License: Apache-2.0 | Stargazers: 4 | Issues: 2 | Issues: 1