Jiashu's starred repositories

SpotServe

SpotServe: Serving Generative Large Language Models on Preemptible Instances

License: Apache-2.0 | Stargazers: 71 | Issues: 0

readerwriterqueue

A fast single-producer, single-consumer lock-free queue for C++

Language: C++ | License: NOASSERTION | Stargazers: 3556 | Issues: 0

lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Language: Python | License: Apache-2.0 | Stargazers: 1853 | Issues: 0

transformer-walkthrough

A walkthrough of transformer architecture code

Language: Jupyter Notebook | License: MIT | Stargazers: 291 | Issues: 0

Language: Cuda | License: BSD-2-Clause | Stargazers: 108 | Issues: 0

intel-extension-for-deepspeed

Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU(XPU) device. Note XPU is already supported by stock DeepSpeed.

Language: C++ | License: MIT | Stargazers: 54 | Issues: 0

generative-ai-for-beginners

18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

Language: Jupyter Notebook | License: MIT | Stargazers: 49928 | Issues: 0

DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

Language: C++ | License: Apache-2.0 | Stargazers: 4979 | Issues: 0

dlrm_datasets

Set of datasets for the deep learning recommendation model (DLRM).

License: MIT | Stargazers: 39 | Issues: 0

sc23-dl-tutorial

SC23 Deep Learning at Scale Tutorial Material

Language: Python | Stargazers: 28 | Issues: 0

eecs598

Advanced Topics on Systems for X

Stargazers: 252 | Issues: 0

Language: C++ | License: Apache-2.0 | Stargazers: 121 | Issues: 0

llm-analysis

Latency and Memory Analysis of Transformer Models for Training and Inference

Language: Python | License: Apache-2.0 | Stargazers: 305 | Issues: 0

punica

Serving multiple LoRA finetuned LLM as one

Language: Python | License: Apache-2.0 | Stargazers: 883 | Issues: 0

scalene

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals

Language: Python | License: Apache-2.0 | Stargazers: 11334 | Issues: 0

qdrant

Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Language: Rust | License: Apache-2.0 | Stargazers: 18713 | Issues: 0

llmperf

LLMPerf is a library for validating and benchmarking LLMs

Language: Python | License: Apache-2.0 | Stargazers: 470 | Issues: 0

DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Language: Python | License: Apache-2.0 | Stargazers: 1751 | Issues: 0

kineto

A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.

Language: HTML | License: NOASSERTION | Stargazers: 649 | Issues: 0

how-to-optimize-gemm

row-major matmul optimization

Language: C++ | License: GPL-3.0 | Stargazers: 555 | Issues: 0

minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Language: Python | License: MIT | Stargazers: 19363 | Issues: 0

mindsdb

The platform for building AI from enterprise data

Language: Python | License: NOASSERTION | Stargazers: 22989 | Issues: 0

Megatron-LM

Ongoing research training transformer models at scale

Language: Python | License: NOASSERTION | Stargazers: 9233 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 1108 | Issues: 0

Language: Cuda | Stargazers: 1978 | Issues: 0

ElasticFlow

Artifacts for our ASPLOS'23 paper ElasticFlow

Language: Python | License: Apache-2.0 | Stargazers: 49 | Issues: 0

gpu_poor

Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization

Language: JavaScript | Stargazers: 699 | Issues: 0

Language: Python | Stargazers: 19 | Issues: 0

FastCkpt

Python package for rematerialization-aware gradient checkpointing

Language: Python | License: Apache-2.0 | Stargazers: 22 | Issues: 0

Language: C | Stargazers: 101 | Issues: 0