SiriusNEO

Shanghai Jiao Tong University

Shanghai, China

Chaofan Lin's starred repositories

cs-self-learning

A self-study guide to computer science

Language: HTML · License: MIT · 52894 312 173

llama3

The official Meta Llama 3 GitHub site

Language: Python · License: NOASSERTION · 23351 190 196

downkyi

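downkyi (哔哩下载姬), a video download tool for the Bilibili website. Supports batch downloads and 8K, HDR, and Dolby Vision, and provides a toolbox (audio/video extraction, watermark removal, etc.).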

Language: C# · License: GPL-3.0 · 19992 141 1053

candle

Minimalist ML framework for Rust

Language: Rust · License: Apache-2.0 · 14595 146 622

dspy

DSPy: The framework for programming (not prompting) foundation models

Language: Python · License: MIT · 14563 129 597

PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

Language: C++ · License: MIT · 7666 75 151

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python · License: Apache-2.0 · 3404 33 1089

SJTUThesis

Shanghai Jiao Tong University LaTeX Thesis Template

Language: TeX · License: Apache-2.0 · 3265 54 484

Awesome-GPTs

Curated list of awesome GPTs 👍.

License: GPL-3.0 · 2975 24 145

sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.

Language: Python · License: Apache-2.0 · 2864 30 286

Checkpoint

Fast and simple homebrew save manager for 3DS and Switch.

Language: C++ · License: GPL-3.0 · 2529 135 428

GodMode9

GodMode9 Explorer - A full access file browser for the Nintendo 3DS console :godmode:

Language: C · License: GPL-3.0 · 2082 117 647

CUDA-Learn-Notes

🎉 CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated sporadically: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.

Language: Cuda · License: GPL-3.0 · 891 10 5

Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

Triton-Puzzles

Puzzles for learning Triton

Language: Jupyter Notebook · License: Apache-2.0 · 873 7 8

How_to_optimize_in_GPU

A series of GPU optimization topics that explains in detail how to optimize CUDA kernels. It covers several basic kernel optimizations, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.

Language: Cuda · License: Apache-2.0 · 761 12 15

pokeyellow

Disassembly of Pokemon Yellow

Language: Assembly · 694 42 33

pokegold

Disassembly of Pokémon Gold/Silver

Language: Assembly · 501 23 24

Awesome-CUDA

This is a list of useful libraries and resources for CUDA development.

LLMSys-PaperList

Large Language Model (LLM) Systems Paper List

pygmtools

A Python Graph Matching Toolkit.

Language: Python · License: NOASSERTION · 279 4 20

mirage

A multi-level tensor algebra superoptimizer

Language: C++ · License: Apache-2.0 · 268 10 17

3DSident

PSPident clone for 3DS

Language: C · License: Zlib · 265 24 24

Atom

[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Language: Cuda · 221 11 13

BitBLAS

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Language: Python · License: MIT · 214 11 18

DistServe

Disaggregated serving system for Large Language Models (LLMs).

Language: Jupyter Notebook · License: Apache-2.0 · 188 4 15

vidur

A large-scale simulation framework for LLM inference

Language: Python · License: MIT · 136 6 11

Quest

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Language: Cuda · 93 3 3

ParrotServe

[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable

Language: Python · License: MIT · 66 4 2

preble

Stateful LLM Serving

Language: Python · License: Apache-2.0 · 16 1 7