jiashu-z

Jiashu's starred repositories

LLM101n

LLM101n: Let's build a Storyteller

lectures

Material for cuda-mode lectures

Language: Jupyter Notebook | Apache-2.0 | 148100

matplotlib-curly-brace

Plot curly brace with matplotlib

Language: HTML | MIT | 4400

Paper-Picture-Writing-Code

MLNLP: Paper Picture Writing Code

Language: TeX | 98500

JetMoE

Reaching LLaMA2 Performance with 0.1M Dollars

Language: Python | Apache-2.0 | 93700

allo

Allo: A Programming Model for Composable Accelerator Design

Language: Python | Apache-2.0 | 9600

llama3-from-scratch

llama3 implementation one matrix multiplication at a time

Language: Jupyter Notebook | MIT | 1075600

Awesome-LLM-Long-Context-Modeling

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

MIT | 52900

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | Apache-2.0 | 1993400

GPUDB-Prefetch

Source code of our DaMoN@SIGMOD 2024 paper "How Does Software Prefetching Work on GPU Query Processing?"

Language: Cuda | 500

ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines

Language: Python | Apache-2.0 | 557600

DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

MIT | 280800

torchtitan

A native PyTorch Library for large model training

Language: Python | BSD-3-Clause | 126000

DeepSeek-LLM

DeepSeek LLM: Let there be answers

Language: Makefile | MIT | 132200

BurstGPT

A GPT-3.5 & GPT-4 Workload Trace to Optimize LLM Serving Systems

Language: Python | CC-BY-4.0 | 8200

adaptive-retrieval

Language: Python | MIT | 14900

Vexless

A code base for Vexless

Language: Python | MIT | 400

Adaptive-RAG

Language: Jsonnet | Apache-2.0 | 11200

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | Apache-2.0 | 3551100

GEAR

GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM

Language: Python | MIT | 11400

retro

Official repo for "On the Generalization Ability of Retrieval-Enhanced Transformers"

Language: Python | Apache-2.0 | 3300

llama_index

LlamaIndex is a data framework for your LLM applications

Language: Python | MIT | 3312100

ollama

Get up and running with Llama 3, Mistral, Gemma 2, and other large language models.

Language: Go | MIT | 7612400

self-rag

This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.

Language: Python | MIT | 158800

contriever

Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning

Language: Python | NOASSERTION | 62900

Verba

Retrieval Augmented Generation (RAG) chatbot powered by Weaviate

Language: Python | BSD-3-Clause | 479300

RETRO-pytorch

Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch

Language: Python | Apache-2.0 | 84500

pytorch-model-train-template

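PyTorch model training code covering single precision, half precision, mixed precision, single GPU, multi-GPU (DP / DDP), FSDP, and DeepSpeed, with a comparison of training speed and GPU memory usage across the methods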

Language: Python | 4600

transformer_vq

Language: Python | 15200

AdaQP

Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training

Language: Python | MIT | 1800