Jinyu Bai's starred repositories
matmulfreellm
Implementation for MatMul-free LM.
HolisticTraceAnalysis
A library to analyze PyTorch traces.
LLaMA3-Quantization
A repository dedicated to evaluating the performance of quantized LLaMA3 using various quantization methods.
AutoSmoothQuant
An easy-to-use package for implementing SmoothQuant for LLMs
Awesome-LLM-Quantization
Awesome list for LLM quantization
ShiftAddLLM
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
Floating-Point-Adder
32-bit pipelined binary floating-point adder using the IEEE-754 single-precision format, in Verilog
BGEMM-CUDA
A repository for Binary General Matrix Multiply (BGEMM) using a customized CUDA kernel. Thanks to FP6-LLM for the wheels!
retraining-free-quantization
RFQuant: Retraining-free Model Quantization via One-Shot Weight-Coupling Learning, CVPR (2024)
Ansor-AF-DS
This repository contains the figures, table data, and source code for the ICS'24 paper "Accelerated Auto-Tuning of GPU Kernels for Tensor Computations".