Yu Zhang (yzhangcs)

Company: Soochow University

Location: Nara

Home Page: https://yzhang.site

Twitter: @yzhang_cs

Organizations
SUDA-LA

Yu Zhang's starred repositories

minGPT

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Language: Python · License: MIT · Stargazers: 19,493 · Watchers: 255 · Issues: 72
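
As a quick illustration of how small the library's surface is, here is a minimal model-construction sketch following the config API shown in the minGPT README (the hyperparameter values here are illustrative):

```python
from mingpt.model import GPT

# minGPT configures models through a single config object
config = GPT.get_default_config()
config.model_type = 'gpt-nano'   # one of the preset model sizes
config.vocab_size = 50257        # e.g. the GPT-2 BPE vocabulary
config.block_size = 128          # maximum context length
model = GPT(config)
```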

LMFlow

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Language: Python · License: Apache-2.0 · Stargazers: 8,135 · Watchers: 73 · Issues: 396

lit-llama

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

Language: Python · License: Apache-2.0 · Stargazers: 5,898 · Watchers: 67 · Issues: 269

JetMoE

Reaching LLaMA2 Performance with 0.1M Dollars

Language: Python · License: Apache-2.0 · Stargazers: 947 · Watchers: 8 · Issues: 9

punica

Serving multiple LoRA-finetuned LLMs as one

Language: Python · License: Apache-2.0 · Stargazers: 899 · Watchers: 14 · Issues: 37

flash-attention-minimal

Flash Attention in ~100 lines of CUDA (forward pass only)

Language: Cuda · License: Apache-2.0 · Stargazers: 501 · Watchers: 4 · Issues: 4
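
For reference, the computation such a kernel implements is plain scaled dot-product attention; a flash-attention kernel produces the same output without materializing the full score matrix. A naive PyTorch baseline (this is the reference computation, not the repo's CUDA code):

```python
import torch

def attention_forward(q, k, v):
    # q, k, v: (batch, heads, seq_len, head_dim)
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return torch.softmax(scores, dim=-1) @ v  # same shape as v
```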

Rewrite-the-Stars

[CVPR 2024] Rewrite the Stars

Language: Python · License: Apache-2.0 · Stargazers: 226 · Watchers: 2 · Issues: 17

long-context-attention

Sequence Parallel Attention for Long Context LLM Training and Inference

minicons

Utility for behavioral and representational analyses of Language Models

Language: Python · License: MIT · Stargazers: 110 · Watchers: 6 · Issues: 16
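
A minimal scoring sketch, assuming the scorer API shown in the minicons README (the model name and sentence are illustrative):

```python
from minicons import scorer

# Score sentences under an autoregressive language model
model = scorer.IncrementalLMScorer('distilgpt2', 'cpu')
print(model.sequence_score(["The keys to the cabinet are on the table."]))
```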

DiJiang

[ICML'24 Oral] The official code of "DiJiang: Efficient Large Language Models through Compact Kernelization", a novel DCT-based linear attention mechanism.

hippogriff

Griffin MQA + Hawk Linear RNN Hybrid

Language: Python · License: MIT · Stargazers: 79 · Watchers: 4 · Issues: 7

mad-lab

A MAD laboratory to improve AI architecture designs 🧪

Language: Python · License: MIT · Stargazers: 77 · Watchers: 1 · Issues: 2

Counting-Stars

Counting-Stars (★)

Language: Jupyter Notebook · License: MIT · Stargazers: 67 · Watchers: 3 · Issues: 3

flash_attn_jax

JAX bindings for Flash Attention v2

Language: C++ · License: BSD-3-Clause · Stargazers: 66 · Watchers: 3 · Issues: 7

LASP

Linear Attention Sequence Parallelism (LASP)

Language: Python · License: MIT · Stargazers: 60 · Watchers: 2 · Issues: 0

gpt-accelera

Simple and efficient PyTorch-native transformer training and inference (batched)

Language: Python · License: BSD-3-Clause · Stargazers: 46 · Watchers: 3 · Issues: 0

GORU-tensorflow

Gated Orthogonal Recurrent Unit implementation in TensorFlow

Language: Python · License: MIT · Stargazers: 35 · Watchers: 5 · Issues: 2

rnn-icrag

Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"

Language: Python · Stargazers: 23 · Watchers: 2 · Issues: 0

ParallelTokenizer

Run the tokenizer in parallel to achieve substantial speedups

Language: Python · License: Apache-2.0 · Stargazers: 12 · Watchers: 3 · Issues: 1
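
The idea generalizes: below is a hedged sketch of chunk-and-fan-out tokenization using Python's standard process pool and a Hugging Face tokenizer. This illustrates the technique only; it is not ParallelTokenizer's own API, and `parallel_tokenize` is a hypothetical helper name.

```python
# Illustration of parallel tokenization, not ParallelTokenizer's API.
from concurrent.futures import ProcessPoolExecutor
from transformers import AutoTokenizer

_tok = None  # per-worker tokenizer, set by the pool initializer

def _init(name):
    global _tok
    _tok = AutoTokenizer.from_pretrained(name)

def _encode(chunk):
    return _tok(chunk)["input_ids"]

def parallel_tokenize(texts, name="gpt2", workers=4):
    # Split the corpus into contiguous chunks, one per worker,
    # so the output order matches the input order.
    step = -(-len(texts) // workers)  # ceiling division
    chunks = [texts[i:i + step] for i in range(0, len(texts), step)]
    with ProcessPoolExecutor(workers, initializer=_init, initargs=(name,)) as pool:
        parts = pool.map(_encode, chunks)
    return [ids for part in parts for ids in part]
```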

based-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python · License: MIT · Stargazers: 8 · Watchers: 0 · Issues: 0

LLMTest_NeedleInAHaystack_HFModel

Supports Hugging Face models for simple retrieval from LLMs at various context lengths to measure accuracy

Language: Jupyter Notebook · License: NOASSERTION · Stargazers: 3 · Watchers: 0 · Issues: 0

seq-test

Understand and test language model architectures on synthetic tasks.

Language: Python · License: Apache-2.0 · Stargazers: 1 · Watchers: 0 · Issues: 0