Chaojian Li's starred repositories

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 26458 | Issues: 219 | Issues: 245

starter-workflows

Accelerating new GitHub Actions workflows

Language: TypeScript | License: NOASSERTION | Stargazers: 9037 | Issues: 457 | Issues: 485

tiny-gpu

A minimal GPU design in Verilog to learn how GPUs work from the ground up

Language: SystemVerilog | Stargazers: 6965 | Issues: 68 | Issues: 23

gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Language: Python | License: BSD-3-Clause | Stargazers: 5574 | Issues: 63 | Issues: 98

cutlass

CUDA Templates for Linear Algebra Subroutines

Language: C++ | License: NOASSERTION | Stargazers: 5451 | Issues: 103 | Issues: 1078

arxiv-latex-cleaner

arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv

Language: Python | License: Apache-2.0 | Stargazers: 5232 | Issues: 33 | Issues: 52

dust3r

DUSt3R: Geometric 3D Vision Made Easy

Language: Python | License: NOASSERTION | Stargazers: 5108 | Issues: 51 | Issues: 154

warp

A Python framework for high performance GPU simulation and graphics

Language: Python | License: NOASSERTION | Stargazers: 4150 | Issues: 53 | Issues: 228

FluidX3D

The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs via OpenCL. Free for non-commercial use.

Language: C++ | License: NOASSERTION | Stargazers: 3837 | Issues: 49 | Issues: 174

DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

pbrt-v4

Source code to pbrt, the ray tracer described in the forthcoming 4th edition of the "Physically Based Rendering: From Theory to Implementation" book.

Language: C++ | License: Apache-2.0 | Stargazers: 2831 | Issues: 69 | Issues: 324

Awesome-LLM-Inference

📖A curated list of Awesome LLM Inference Paper with codes, TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention etc.

LGM

[ECCV 2024 Oral] LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.

Language: Python | License: MIT | Stargazers: 1591 | Issues: 34 | Issues: 69

cs249r_book

Collaborative book Machine Learning Systems

Language: TeX | License: NOASSERTION | Stargazers: 1005 | Issues: 13 | Issues: 227

TensorRT-Model-Optimizer

TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, sparsity, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.

Language: Python | License: NOASSERTION | Stargazers: 457 | Issues: 14 | Issues: 73

tinymembench

Simple benchmark for memory throughput and latency

Grendel-GS

Ongoing research training gaussian splatting at scale by distributed system

Language: Python | License: Apache-2.0 | Stargazers: 338 | Issues: 17 | Issues: 23
Language: C++ | License: Apache-2.0 | Stargazers: 239 | Issues: 8 | Issues: 133

KIVI

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

Language: Python | License: MIT | Stargazers: 218 | Issues: 5 | Issues: 24

nerfbaselines

Reproducible evaluation of NeRF methods

Language: Python | License: MIT | Stargazers: 152 | Issues: 3 | Issues: 9
Language: Python | License: MIT | Stargazers: 135 | Issues: 4 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 99 | Issues: 4 | Issues: 8

ShiftAddLLM

ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization

Language: Python | License: Apache-2.0 | Stargazers: 84 | Issues: 3 | Issues: 5

Magicube

Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.

Language: C++ | License: GPL-3.0 | Stargazers: 80 | Issues: 4 | Issues: 2

Edge-LLM

[DAC 2024] EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting

ACT

[ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration

Language: Python | Stargazers: 16 | Issues: 0 | Issues: 0
Language: Python | License: MIT | Stargazers: 4 | Issues: 0 | Issues: 0

3D-Carbon

3D-Carbon: An Analytical Carbon Modeling Tool for 3D and 2.5D Integrated Circuits

Language: Jupyter Notebook | Stargazers: 4 | Issues: 0 | Issues: 0

LogarithmicPosit

[DAC'24] Official Implementation of the Logarithmic Posit (LP) Number System

License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0