Jiarui Fang's repositories
LLMSpeculativeSampling
Fast inference from large language models via speculative decoding
long-context-attention
Sequence Parallel Attention for Long Context LLM Model Training and Inference
LLMRoofline
Compare different hardware platforms via the Roofline Model for LLM inference tasks.
PyTorchMemTracer
Depict the GPU memory footprint during DNN training in PyTorch
crack_leetcode
Five days of practice problems, three days of mock tests! Quickly master LeetCode problem-solving patterns!
LLM-Viewer
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.
ssh-passwd-free
A method for setting up password-free SSH login across a set of IPs
TensorrtBenchmark
Benchmark BERT using TensorRT
ColossalAI
Colossal-AI: A Unified Deep Learning System for Big Model Era
TensorNVMe
A Python library that transfers PyTorch tensors between CPU and NVMe
ring-flash-attention
Ring attention implementation with flash attention
BM-Training
Dive into Big Model Training
chinese-poetry
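The most comprehensive database of Chinese poetry 🧶: nearly 14,000 poets of the Tang and Song dynasties, close to 55,000 Tang poems plus 260,000 Song poems, and 21,050 ci (lyric poems) by 1,564 ci poets of the Northern and Southern Song periods.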
Conchmark
A benchmark library for Colossal-AI.
EnergonAI
Large-scale model inference.
FreqCacheEmbedding
A memory efficient DLRM training solution using ColossalAI
leptonai
A Pythonic framework to simplify AI service building
RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.