Panpan XU (lliquid)

Company: Amazon AWS AI

Location: Sunnyvale, CA

Home Page: https://lliquid.github.io/homepage

Panpan XU's starred repositories

LLMs-from-scratch

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 25114 | Issues: 278 | Issues: 77

tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Language: Python | License: MIT | Stargazers: 11595 | Issues: 168 | Issues: 229

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 8007 | Issues: 87 | Issues: 1740

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python | License: MIT | Stargazers: 6232 | Issues: 35 | Issues: 1021

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python | License: NOASSERTION | Stargazers: 5905 | Issues: 47 | Issues: 78

gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Language: Python | License: BSD-3-Clause | Stargazers: 5473 | Issues: 64 | Issues: 97

sglang

SGLang is a fast serving framework for large language models and vision language models.

Language: Python | License: Apache-2.0 | Stargazers: 4587 | Issues: 45 | Issues: 433

tiny-cuda-nn

Lightning fast C++/CUDA neural network framework

Language: C++ | License: NOASSERTION | Stargazers: 3628 | Issues: 50 | Issues: 381

mamba-minimal

Simple, minimal implementation of the Mamba SSM in one file of PyTorch.

Language: Python | License: Apache-2.0 | Stargazers: 2513 | Issues: 23 | Issues: 26

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2155 | Issues: 32 | Issues: 85

Awesome-Text2SQL

Curated tutorials and resources for Large Language Models, Text2SQL, Text2DSL, Text2API, Text2Vis and more.

smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Language: Python | License: MIT | Stargazers: 1160 | Issues: 21 | Issues: 86

streaming

A Data Streaming Library for Efficient Neural Network Training

Language: Python | License: Apache-2.0 | Stargazers: 1062 | Issues: 20 | Issues: 160

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda | License: Apache-2.0 | Stargazers: 1031 | Issues: 15 | Issues: 91

blocksparse

Efficient GPU kernels for block-sparse matrix multiplication and convolution

Language: Cuda | License: MIT | Stargazers: 1020 | Issues: 199 | Issues: 48

extension-cpp

C++ extensions in PyTorch

dolma

Data and tools for generating and inspecting OLMo pre-training data.

Language: Python | License: Apache-2.0 | Stargazers: 889 | Issues: 18 | Issues: 67

safari

Convolutions for Sequence Modeling

Language: Assembly | License: Apache-2.0 | Stargazers: 858 | Issues: 35 | Issues: 38

llama-moe

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training

Language: Python | License: Apache-2.0 | Stargazers: 830 | Issues: 8 | Issues: 18

marlin

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens.

Language: Python | License: Apache-2.0 | Stargazers: 528 | Issues: 15 | Issues: 26

megalodon

Reference implementation of Megalodon 7B model

Language: Cuda | License: MIT | Stargazers: 500 | Issues: 14 | Issues: 7

NVTX

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.

Language: C | License: Apache-2.0 | Stargazers: 274 | Issues: 12 | Issues: 35

QuaRot

Code for QuaRot, an end-to-end 4-bit inference of large language models.

Language: Python | License: Apache-2.0 | Stargazers: 241 | Issues: 11 | Issues: 36

optimum-neuron

Easy, fast and very cheap training and inference on AWS Trainium and Inferentia chips.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 189 | Issues: 13 | Issues: 264

CrossNER

CrossNER: Evaluating Cross-Domain Named Entity Recognition (AAAI-2021)

Language: Python | License: MIT | Stargazers: 120 | Issues: 4 | Issues: 10

zeno-hub

AI Evaluation Platform

Language: CSS | License: MIT | Stargazers: 37 | Issues: 2 | Issues: 10

hyena

JAX/Flax implementation of the Hyena Hierarchy

Language: Jupyter Notebook | License: MIT | Stargazers: 29 | Issues: 4 | Issues: 0

Language: C++ | Stargazers: 10 | Issues: 0 | Issues: 0

PromptNER

Prompting For Named Entity Recognition