Yudi Zhang (YudiZh)


Company: Harbin Institute of Technology

Location: Harbin

Home Page: https://www.hit.edu.cn/


Yudi Zhang's starred repositories

LLM101n

LLM101n: Let's build a Storyteller

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 23842 · Issues: 221 · Issues: 3653

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python · License: MIT · Stargazers: 19263 · Issues: 297 · Issues: 1340

RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings.

Language: Python · License: Apache-2.0 · Stargazers: 12050 · Issues: 135 · Issues: 197
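The trade-off the RWKV blurb describes — recurrent state instead of full attention — can be illustrated with a toy decayed key-value recurrence. This is a simplified sketch for intuition only, not RWKV's actual WKV formulation:

```python
import math

def recurrent_step(state, k, v, decay=0.9):
    """One token step of a toy decayed key-value recurrence.

    Full attention re-reads the whole history at each step (O(T) per
    token); a recurrence folds history into a fixed-size state, so
    per-token cost is O(1) in sequence length.
    """
    num, den = state
    num = decay * num + math.exp(k) * v   # decayed weighted-value accumulator
    den = decay * den + math.exp(k)       # decayed normalizer
    return (num, den), num / den          # new state, output

state = (0.0, 0.0)
outputs = []
for k, v in [(0.1, 1.0), (0.5, 2.0), (0.2, 3.0)]:
    state, y = recurrent_step(state, k, v)
    outputs.append(y)
```

Because the state is two scalars per channel regardless of context length, memory stays constant as the sequence grows — the property behind the "infinite ctx_len" claim.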

llama-recipes

Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default & custom datasets for applications such as summarization and Q&A, and a number of candidate inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama 3 for WhatsApp & Messenger.

Language: Jupyter Notebook · License: NOASSERTION · Stargazers: 10922 · Issues: 88 · Issues: 300

ml-engineering

Machine Learning Engineering Open Book

Language: Python · License: CC-BY-SA-4.0 · Stargazers: 10309 · Issues: 107 · Issues: 18

Yi

A series of large language models trained from scratch by developers @01-ai

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 7520 · Issues: 111 · Issues: 289

OpenAgents

[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild

Language: Python · License: Apache-2.0 · Stargazers: 3807 · Issues: 42 · Issues: 98

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 2058 · Issues: 34 · Issues: 79
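The multiple-decoding-heads idea can be sketched in miniature. All names and the integer "model" below are made up for illustration (this is not Medusa's API): cheap heads guess several future tokens, and the base model verifies the guesses, keeping the longest agreeing prefix so several tokens can be accepted per expensive forward pass.

```python
def base_model_next(context):
    # Stand-in for the expensive base model: next token = sum mod 10.
    return sum(context) % 10

def medusa_heads(context, n_heads=3):
    # Stand-in for cheap extra heads guessing tokens at offsets 1..n_heads.
    guesses = []
    ctx = list(context)
    for _ in range(n_heads):
        g = sum(ctx) % 10          # in this toy, the heads happen to be exact
        guesses.append(g)
        ctx.append(g)
    return guesses

def verify(context, guesses):
    """Accept the longest prefix of guesses the base model agrees with."""
    accepted = []
    ctx = list(context)
    for g in guesses:
        t = base_model_next(ctx)
        if t != g:
            break                  # first disagreement ends the accepted run
        accepted.append(t)
        ctx.append(t)
    return accepted

ctx = [3, 1, 4]
accepted = verify(ctx, medusa_heads(ctx))
```

In practice the heads are imperfect, so the speedup depends on how long the accepted prefix is on average — the quantity benchmarks like Spec-Bench measure.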

MotionGPT

[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified motion-language generation model using LLMs

Language: Python · License: MIT · Stargazers: 1398 · Issues: 48 · Issues: 92

the-art-of-debugging

The Art of Debugging

Language: C · License: CC-BY-SA-4.0 · Stargazers: 764 · Issues: 16 · Issues: 0

EAGLE

Official Implementation of EAGLE-1 and EAGLE-2

Language: Python · License: Apache-2.0 · Stargazers: 682 · Issues: 12 · Issues: 96

PiPPy

Pipeline Parallelism for PyTorch

Language: Python · License: BSD-3-Clause · Stargazers: 677 · Issues: 37 · Issues: 255
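Why splitting a model into pipeline stages helps can be shown with a toy schedule (a GPipe-style forward-only sketch, not PiPPy's API): with S stages and M microbatches, the pipelined forward pass takes S + M - 1 ticks instead of S x M, because stages process different microbatches concurrently.

```python
def pipeline_schedule(num_stages, num_microbatches):
    """Return, per tick, the list of (stage, microbatch) pairs running in parallel."""
    ticks = []
    for t in range(num_stages + num_microbatches - 1):
        # Stage s works on microbatch t - s once it has arrived and until it runs out.
        active = [(s, t - s) for s in range(num_stages)
                  if 0 <= t - s < num_microbatches]
        ticks.append(active)
    return ticks

sched = pipeline_schedule(num_stages=3, num_microbatches=4)
pipelined_ticks = len(sched)   # 3 + 4 - 1 = 6 ticks with overlap
sequential_ticks = 3 * 4       # 12 ticks if stages never overlap
```

The idle ticks at the start and end (the "pipeline bubble") shrink relative to total work as the microbatch count grows, which is why more microbatches improve utilization.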

fairseq2

FAIR Sequence Modeling Toolkit 2

Language: Python · License: MIT · Stargazers: 638 · Issues: 18 · Issues: 97

DoLa

Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
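The core contrast in DoLa can be sketched as a log-softmax difference between a late layer's logits and an early layer's (a toy version; the paper additionally selects the premature layer dynamically and applies an adaptive plausibility constraint):

```python
import math

def log_softmax(logits):
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def contrast(final_logits, early_logits):
    """Score tokens by final-layer log-prob minus early-layer log-prob.

    Tokens whose probability grows with depth (factual knowledge tends
    to emerge in later layers) are boosted; tokens the early layer
    already favored are damped.
    """
    lf, le = log_softmax(final_logits), log_softmax(early_logits)
    return [f - e for f, e in zip(lf, le)]

final = [3.0, 2.5, 0.0]   # final layer alone would pick token 0
early = [2.0, 0.0, 0.0]   # early layer already liked token 0
scores = contrast(final, early)
best = scores.index(max(scores))  # contrast picks token 1 instead
```

Here the contrast flips the choice from token 0 to token 1, since token 1 is the one whose probability rose most between the early and final layers.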

ALMA

State-of-the-art LLM-based translation models.

Language: Ruby · License: MIT · Stargazers: 366 · Issues: 12 · Issues: 51

ChunkLlama

[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"

Language: Python · License: Apache-2.0 · Stargazers: 304 · Issues: 7 · Issues: 20

Transformer-M

[ICLR 2023] One Transformer Can Understand Both 2D & 3D Molecular Data (official implementation)

Language: Python · License: MIT · Stargazers: 197 · Issues: 6 · Issues: 21

CapsFusion

[CVPR 2024] CapsFusion: Rethinking Image-Text Data at Scale

Language: Python · License: NOASSERTION · Stargazers: 192 · Issues: 4 · Issues: 6

Spec-Bench

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)

Language: Python · License: Apache-2.0 · Stargazers: 127 · Issues: 1 · Issues: 11

aligner

Achieving Efficient Alignment through Learned Correction

DiJiang

[ICML'24 Oral] The official code of "DiJiang: Efficient Large Language Models through Compact Kernelization", a novel DCT-based linear attention mechanism.

linear_open_lm

A repository for research on medium sized language models.

Language: Python · License: MIT · Stargazers: 69 · Issues: 0 · Issues: 0

HGRN

[NeurIPS 2023 spotlight] Official implementation of HGRN in our NeurIPS 2023 paper - Hierarchically Gated Recurrent Neural Network for Sequence Modeling

csl

[Preprint] Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts

Language: Python · License: NOASSERTION · Stargazers: 14 · Issues: 2 · Issues: 0