Yang (ybai62868)

Company: CUHK

Location: Hong Kong

Home Page: https://ybai62868.github.io/

Yang's starred repositories

AI-For-Beginners

12 Weeks, 24 Lessons, AI for All!

Language: Jupyter Notebook | License: MIT | Stargazers: 34392 | Issues: 402 | Issues: 111

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda | License: MIT | Stargazers: 23904 | Issues: 241 | Issues: 139

codellama

Inference code for CodeLlama models

Language: Python | License: NOASSERTION | Stargazers: 15942 | Issues: 185 | Issues: 195

LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Language: Python | License: Apache-2.0 | Stargazers: 8234 | Issues: 72 | Issues: 409

tiny-gpu

A minimal GPU design in Verilog to learn how GPUs work from the ground up

Language: SystemVerilog | Stargazers: 6985 | Issues: 68 | Issues: 23

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python | License: MIT | Stargazers: 6649 | Issues: 37 | Issues: 1097

LLMLingua

Compresses the prompt and KV cache to speed up LLM inference and improve the model's perception of key information, achieving up to 20x compression with minimal performance loss.

Language: Python | License: MIT | Stargazers: 4498 | Issues: 33 | Issues: 120

matmulfreellm

Implementation for MatMul-free LM.

Language: Python | License: Apache-2.0 | Stargazers: 2896 | Issues: 43 | Issues: 31

VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Language: Python | License: Apache-2.0 | Stargazers: 1872 | Issues: 27 | Issues: 121

ThunderKittens

Tile primitives for speedy kernels

Language: Cuda | License: MIT | Stargazers: 1528 | Issues: 24 | Issues: 26

Time-LLM

[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"

Language: Python | License: Apache-2.0 | Stargazers: 1304 | Issues: 15 | Issues: 147

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Language: Python | License: MIT | Stargazers: 1257 | Issues: 27 | Issues: 44

Triton-Puzzles

Puzzles for learning Triton

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1019 | Issues: 10 | Issues: 11

VITA

✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM

Language: Python | License: NOASSERTION | Stargazers: 839 | Issues: 38 | Issues: 45

Nanoflow

A throughput-oriented high-performance serving framework for LLMs

Language: Cuda | License: Apache-2.0 | Stargazers: 577 | Issues: 6 | Issues: 15

depyf

depyf is a tool to help you understand and adapt to the PyTorch compiler, torch.compile.

Language: Python | License: MIT | Stargazers: 467 | Issues: 8 | Issues: 25

basalt

A Machine Learning framework from scratch in Pure Mojo 🔥

Language: Mojo | License: NOASSERTION | Stargazers: 399 | Issues: 12 | Issues: 38

timeloop

Timeloop performs modeling, mapping and code-generation for tensor algebra workloads on various accelerator architectures.

Language: C++ | License: BSD-3-Clause | Stargazers: 328 | Issues: 21 | Issues: 179

zero-bubble-pipeline-parallelism

Zero Bubble Pipeline Parallelism

Language: Python | License: NOASSERTION | Stargazers: 264 | Issues: 6 | Issues: 26

xdsl

A Python Compiler Design Toolkit

Language: Python | License: NOASSERTION | Stargazers: 257 | Issues: 19 | Issues: 429

vidur

A large-scale simulation framework for LLM inference

Language: Python | License: MIT | Stargazers: 251 | Issues: 6 | Issues: 17

matmul.c

Fast multi-threaded matrix multiplication in C

Language: C | License: MIT | Stargazers: 170 | Issues: 5 | Issues: 0

ParrotServe

[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable

Language: Python | License: MIT | Stargazers: 99 | Issues: 5 | Issues: 4

TiledKernel

TiledKernel is a code generation library based on macro kernels and a memory hierarchy graph data structure.

Language: C++ | License: MIT | Stargazers: 18 | Issues: 2 | Issues: 0

sgemm_riscv

This project records the process of optimizing SGEMM (single-precision floating-point General Matrix Multiplication) on the RISC-V platform.

Language: C | License: MIT | Stargazers: 15 | Issues: 0 | Issues: 0

rvv-kernels

RISC-V Vector Kernel C/LLVM-IR generator

Language: C | License: Apache-2.0 | Stargazers: 5 | Issues: 2 | Issues: 0

Ansor-AF-DS

This repository contains the figures, tables and source code in the ICS'24 paper: "Accelerated Auto-Tuning of GPU Kernels for Tensor Computations".

Language: Python | Stargazers: 5 | Issues: 0 | Issues: 0