Yijia Diao (LittleQili)

Company: Shanghai Jiao Tong University

Location: Shanghai, China

Organizations
SJTU-CSE

Yijia Diao's starred repositories

transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | Stargazers: 131831 | Issues: 1117 | Issues: 15657

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda | License: MIT | Stargazers: 23217 | Issues: 226 | Issues: 132

peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Language: Python | License: Apache-2.0 | Stargazers: 15762 | Issues: 104 | Issues: 1016

LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Language: Python | License: MIT | Stargazers: 10286 | Issues: 66 | Issues: 105

server

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Language: Python | License: BSD-3-Clause | Stargazers: 8020 | Issues: 139 | Issues: 3698

tiny-gpu

A minimal GPU design in Verilog to learn how GPUs work from the ground up

Language: SystemVerilog | Stargazers: 6910 | Issues: 68 | Issues: 22

corenet

CoreNet: A library for training deep neural networks

Language: Python | License: NOASSERTION | Stargazers: 6909 | Issues: 63 | Issues: 20

miniforge

A conda-forge distribution.

Language: Shell | License: NOASSERTION | Stargazers: 6132 | Issues: 55 | Issues: 362

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python | License: Apache-2.0 | Stargazers: 4134 | Issues: 35 | Issues: 1335

Liger-Kernel

Efficient Triton Kernels for LLM Training

Language: Python | License: BSD-2-Clause | Stargazers: 2857 | Issues: 33 | Issues: 56

ThunderKittens

Tile primitives for speedy kernels

Language: Cuda | License: MIT | Stargazers: 1479 | Issues: 25 | Issues: 22

gpushare-scheduler-extender

GPU Sharing Scheduler for Kubernetes Cluster

Language: Go | License: Apache-2.0 | Stargazers: 1387 | Issues: 39 | Issues: 149

AzurePublicDataset

Microsoft Azure Traces

Language: Jupyter Notebook | License: CC-BY-4.0 | Stargazers: 781 | Issues: 37 | Issues: 35

Overleaf-Workshop

Open Overleaf/ShareLaTex projects in vscode, with full collaboration support.

Language: TypeScript | License: AGPL-3.0 | Stargazers: 443 | Issues: 3 | Issues: 93

BitBLAS

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Language: Python | License: MIT | Stargazers: 332 | Issues: 13 | Issues: 53

DistServe

Disaggregated serving system for Large Language Models (LLMs).

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 272 | Issues: 4 | Issues: 37

rccl

ROCm Communication Collectives Library (RCCL)

Language: C++ | License: NOASSERTION | Stargazers: 248 | Issues: 32 | Issues: 90

flux

A fast communication-overlapping library for tensor parallelism on GPUs.

Language: C++ | License: Apache-2.0 | Stargazers: 177 | Issues: 7 | Issues: 21

Caffeine

Caffeine for macOS 11+

Language: Swift | License: MIT | Stargazers: 138 | Issues: 1 | Issues: 0

TiledCUDA

TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstraction for processing tiles.

Language: C++ | License: MIT | Stargazers: 101 | Issues: 4 | Issues: 53

SpotServe

SpotServe: Serving Generative Large Language Models on Preemptible Instances

orion

An interference-aware scheduler for fine-grained GPU sharing

Language: Python | License: MIT | Stargazers: 89 | Issues: 2 | Issues: 17

paella

Paella: Low-latency Model Serving with Virtualized GPU Scheduling

pyjuice

Scalable training and inference for Probabilistic Circuits

Language: Python | License: Apache-2.0 | Stargazers: 44 | Issues: 5 | Issues: 4

rccl-tests

RCCL Performance Benchmark Tests

Language: Cuda | License: NOASSERTION | Stargazers: 41 | Issues: 10 | Issues: 8

TiledKernel

TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.

Language: C++ | License: MIT | Stargazers: 16 | Issues: 2 | Issues: 0

Language: Python | License: NOASSERTION | Stargazers: 14 | Issues: 3 | Issues: 0

chipgptft

Data is all you need: Finetuning LLMs for Chip Design via an Automated design-data augmentation framework (DAC 2024)

Language: Python | Stargazers: 9 | Issues: 0 | Issues: 0