Yijia Diao (LittleQili)

Company: Shanghai Jiao Tong University

Location: Shanghai, China


Organizations
SJTU-CSE

Yijia Diao's starred repositories

orion

An interference-aware scheduler for fine-grained GPU sharing

Language: Python | License: MIT | Stargazers: 90 | Issues: 0

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python | License: Apache-2.0 | Stargazers: 4137 | Issues: 0

nccl-tests

NCCL Tests

Language: Cuda | License: BSD-3-Clause | Stargazers: 814 | Issues: 0

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | License: Apache-2.0 | Stargazers: 21567 | Issues: 0

FlexFlow

FlexFlow Serve: Low-Latency, High-Performance LLM Serving

Language: C++ | License: Apache-2.0 | Stargazers: 1642 | Issues: 0

llm-analysis

Latency and Memory Analysis of Transformer Models for Training and Inference

Language: Python | License: Apache-2.0 | Stargazers: 336 | Issues: 0

llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Language: Python | License: MIT | Stargazers: 2322 | Issues: 0

Fuser

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

Language: C++ | License: NOASSERTION | Stargazers: 246 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 26263 | Issues: 0

dlrover

DLRover: An Automatic Distributed Deep Learning System

Language: Python | License: NOASSERTION | Stargazers: 1194 | Issues: 0

distrifuser

[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Language: Python | License: MIT | Stargazers: 551 | Issues: 0

GPU-scheduler-for-deep-learning

GPU-scheduler-for-deep-learning

Language: C++ | License: MIT | Stargazers: 192 | Issues: 0

bamboo

Bamboo is a system for running large pipeline-parallel DNNs affordably, reliably, and efficiently using spot instances.

Language: Python | License: MIT | Stargazers: 46 | Issues: 0

TGS

Artifacts for our NSDI'23 paper TGS

Language: Python | License: Apache-2.0 | Stargazers: 63 | Issues: 0

gemma.cpp

Lightweight, standalone C++ inference engine for Google's Gemma models.

Language: C++ | License: Apache-2.0 | Stargazers: 5899 | Issues: 0

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python | License: Apache-2.0 | Stargazers: 5237 | Issues: 0

gemma

Open weights LLM from Google DeepMind.

Language: Python | License: Apache-2.0 | Stargazers: 2382 | Issues: 0

multi-gpu-programming-models

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Language: Cuda | License: BSD-3-Clause | Stargazers: 525 | Issues: 0

PipeSwitch

PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications

Language: Python | License: Apache-2.0 | Stargazers: 124 | Issues: 0
Language: C++ | Stargazers: 40 | Issues: 0

mig-parted

MIG Partition Editor for NVIDIA GPUs

Language: Go | License: Apache-2.0 | Stargazers: 162 | Issues: 0

MIGProfiler

Multi-Instance-GPU profiling tool

Language: Jupyter Notebook | License: MIT | Stargazers: 51 | Issues: 0

gdev

First-Class GPU Resource Management: Device Drivers, Runtimes, and CUDA Compilers for Nouveau.

Language: C | License: MIT | Stargazers: 343 | Issues: 0

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda | License: Apache-2.0 | Stargazers: 1124 | Issues: 0

gdev

First-Class GPU Resource Management: Device Drivers, Runtimes, and CUDA Compilers for Nouveau.

Language: C | License: MIT | Stargazers: 44 | Issues: 0

slurm

Slurm: A Highly Scalable Workload Manager

Language: C | License: NOASSERTION | Stargazers: 2572 | Issues: 0

TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Language: C++ | License: Apache-2.0 | Stargazers: 10534 | Issues: 0

pygmtools

A Python Graph Matching Toolkit.

Language: Python | License: NOASSERTION | Stargazers: 289 | Issues: 0

nvidia-container-toolkit

Build and run containers leveraging NVIDIA GPUs

Language: Go | License: Apache-2.0 | Stargazers: 2194 | Issues: 0

hidet

An open-source efficient deep learning framework/compiler, written in Python.

Language: Python | License: Apache-2.0 | Stargazers: 645 | Issues: 0