Yi Wang (wangkuiyi)

User data from GitHub: https://github.com/wangkuiyi

Company: Facebook

Location: San Francisco Bay Area, CA

Home Page: https://www.linkedin.com/in/yidewang

GitHub: @wangkuiyi


Organizations
elasticdl

Yi Wang's repositories

gotorch

A Go idiomatic binding to the C++ core of PyTorch

Language: Go · License: MIT · Stargazers: 338 · Issues: 15 · Issues: 68

alpaca.cpp-ios

Locally run an Instruction-Tuned Chat-Style LLM

Language: C · License: MIT · Stargazers: 1 · Issues: 0 · Issues: 0

Adv360-Pro-ZMK

Production repository for the all-new Advantage360 Professional using ZMK engine

Language: Makefile · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

cpuinfo

CPU INFOrmation library (x86/x86-64/ARM/ARM64, Linux/Windows/Android/macOS/iOS)

Language: C · License: BSD-2-Clause · Stargazers: 0 · Issues: 0 · Issues: 0

iree

👻

Language: C++ · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

iree-for-apple-platforms

This project builds the IREE compiler for macOS and the IREE runtime for macOS, iOS, watchOS, and tvOS

Language: Shell · Stargazers: 0 · Issues: 2 · Issues: 3

jax-triton

jax-triton contains integrations between JAX and OpenAI Triton

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

JetStream

JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

lam2s

lam2s = Lean And Mean LAnguage Model Serving

Stargazers: 0 · Issues: 0 · Issues: 0

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

ml_collections

ML Collections is a library of Python Collections designed for ML use cases.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0
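
A minimal sketch of typical ml_collections usage, centered on its ConfigDict class (the field names below are illustrative, not taken from the repository):

    from ml_collections import config_dict

    # Build a nested, attribute-accessible configuration.
    cfg = config_dict.ConfigDict()
    cfg.learning_rate = 3e-4
    cfg.batch_size = 128
    cfg.model = config_dict.ConfigDict()
    cfg.model.hidden_size = 512

    print(cfg.learning_rate)   # attribute access
    print(cfg["batch_size"])   # dict-style access also works
    cfg.lock()                 # freeze the key set so misspelled keys raise errors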

mlx

MLX: An array framework for Apple silicon

Language: C++ · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0
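
A minimal sketch of MLX's Python API, assuming an Apple-silicon machine with the mlx package installed; it shows the lazy-evaluation and function-transform style the framework is built around:

    import mlx.core as mx

    a = mx.array([1.0, 2.0, 3.0])
    b = mx.ones((3,))
    c = a + b      # lazy: records the computation, nothing runs yet
    mx.eval(c)     # force evaluation on the default (GPU) device
    print(c)

    # Composable transforms, e.g. the gradient of a scalar-valued function.
    grad_fn = mx.grad(lambda x: (x * x).sum())
    print(grad_fn(a))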

mlx-examples

Examples in the MLX framework

Language: Python · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0

mlx-lm

Run LLMs with MLX

License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0
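
A minimal sketch of running a model with mlx-lm's Python API (the checkpoint name is only an example of an MLX-converted model on the Hugging Face Hub):

    from mlx_lm import load, generate

    # Downloads and loads an MLX-format checkpoint plus its tokenizer.
    model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.3-4bit")

    text = generate(
        model,
        tokenizer,
        prompt="Write a haiku about tensors.",
        max_tokens=128,
    )
    print(text)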

pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Language: C++ · License: NOASSERTION · Stargazers: 0 · Issues: 1 · Issues: 0
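
A minimal PyTorch sketch illustrating the "tensors plus dynamic autograd" model the description refers to:

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
    opt = torch.optim.SGD(model.parameters(), lr=1e-2)

    x = torch.randn(32, 4)
    y = torch.randn(32, 1)
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()   # gradients via the dynamically built graph
    opt.step()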

pytorch_memlab

Profiling and inspecting memory in pytorch

Language: Python · License: MIT · Stargazers: 0 · Issues: 0 · Issues: 0
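
A minimal sketch of pytorch_memlab's MemReporter, assuming a CUDA device is available; it attributes allocated GPU memory to the tensors of a given module:

    import torch
    from pytorch_memlab import MemReporter

    model = torch.nn.Linear(1024, 1024).cuda()
    out = model(torch.randn(64, 1024, device="cuda"))

    reporter = MemReporter(model)
    reporter.report()   # per-tensor breakdown of CUDA memory usage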

sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.

Language: C++ · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0
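
A minimal sketch of training and using a SentencePiece model from Python (the corpus path, model prefix, and vocabulary size are placeholders):

    import sentencepiece as spm

    # Train an unsupervised subword model on a plain-text corpus
    # (one sentence per line).
    spm.SentencePieceTrainer.train(
        input="corpus.txt", model_prefix="m", vocab_size=8000
    )

    sp = spm.SentencePieceProcessor(model_file="m.model")
    ids = sp.encode("Hello world", out_type=int)
    pieces = sp.encode("Hello world", out_type=str)
    print(ids, pieces, sp.decode(ids))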

sglang

SGLang is a fast serving framework for large language models and vision language models.

License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0
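
A minimal sketch of SGLang's frontend language, assuming a local SGLang server has already been launched (the model path and port are examples):

    import sglang as sgl

    @sgl.function
    def qa(s, question):
        s += sgl.user(question)
        s += sgl.assistant(sgl.gen("answer", max_tokens=128))

    # Server started beforehand, e.g.:
    #   python -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct --port 30000
    sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

    state = qa.run(question="What is speculative decoding?")
    print(state["answer"])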

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0
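
A minimal sketch of the high-level Python LLM API the description mentions, assuming a recent TensorRT-LLM release and an NVIDIA GPU (the model name is illustrative):

    from tensorrt_llm import LLM, SamplingParams

    # Builds or loads a TensorRT engine for the model under the hood.
    llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")
    params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

    for out in llm.generate(["The capital of France is"], sampling_params=params):
        print(out.outputs[0].text)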

tensorrtllm_backend

The Triton TensorRT-LLM Backend

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

torchft

PyTorch per step fault tolerance (actively under development)

License: NOASSERTION · Stargazers: 0 · Issues: 0 · Issues: 0

xgrammar

Fast, Flexible and Portable Structured Generation

License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0