Yangyu Zhang (LinkZyy)

Company: UCAS

Location: Beijing

Yangyu Zhang's starred repositories

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda, License: Apache-2.0, Stargazers: 986, Issues: 0

vattention

Dynamic Memory Management for Serving LLMs without PagedAttention

Language: C, License: MIT, Stargazers: 170, Issues: 0

DietCode

DietCode Code Release

Language: Cuda, Stargazers: 60, Issues: 0

cuda_ioctl_sniffer

Sniff CUDA ioctls

Language: C, Stargazers: 172, Issues: 0

AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.

Language: Python, License: MIT, Stargazers: 1558, Issues: 0

Retrieval_Head

Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality"

Language: Python, Stargazers: 131, Issues: 0

academicpages.github.io

GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

Language: JavaScript, License: MIT, Stargazers: 11367, Issues: 0

grok-1

Grok open release

Language: Python, License: Apache-2.0, Stargazers: 49330, Issues: 0

FlexFlow

FlexFlow Serve: Low-Latency, High-Performance LLM Serving

Language: C++, License: Apache-2.0, Stargazers: 1619, Issues: 0

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python, License: Apache-2.0, Stargazers: 21254, Issues: 0

langchain

🦜🔗 Build context-aware reasoning applications

Language: Jupyter Notebook, License: MIT, Stargazers: 90785, Issues: 0

DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Language: Python, License: NOASSERTION, Stargazers: 5873, Issues: 0

sglang

SGLang is yet another fast serving framework for large language models and vision language models.

Language: Python, License: Apache-2.0, Stargazers: 4181, Issues: 0

Language: Jupyter Notebook, Stargazers: 9, Issues: 0

triton

Development repository for the Triton language and compiler

Language: C++, License: MIT, Stargazers: 12254, Issues: 0

d2l-zh

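Dive into Deep Learning: written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.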

Language: Python, License: Apache-2.0, Stargazers: 60496, Issues: 0

StreamDiffusion

StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation

Language: Python, License: Apache-2.0, Stargazers: 9391, Issues: 0

SDL

Simple DirectMedia Layer

Language: C, License: Zlib, Stargazers: 9032, Issues: 0

Language: Python, License: Apache-2.0, Stargazers: 252, Issues: 0

llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Language: Python, License: MIT, Stargazers: 2241, Issues: 0

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 2131, Issues: 0

punica

Serving multiple LoRA-finetuned LLMs as one

Language: Python, License: Apache-2.0, Stargazers: 923, Issues: 0

MemGPT

Create LLM agents with long-term memory and custom tools 📚🦙

Language: Python, License: Apache-2.0, Stargazers: 11153, Issues: 0

LLMSpeculativeSampling

Fast inference from large language models via speculative decoding

Language: Python, Stargazers: 462, Issues: 0

S-LoRA

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Language: Python, License: Apache-2.0, Stargazers: 1666, Issues: 0

MQuAKE

[EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions

Language: Jupyter Notebook, License: MIT, Stargazers: 89, Issues: 0

ACL2023-Retrieval-LM.github.io

https://acl2023-retrieval-lm.github.io/

Language: JavaScript, Stargazers: 150, Issues: 0

LLMSurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

Language: Python, Stargazers: 9823, Issues: 0

generative_agents

Generative Agents: Interactive Simulacra of Human Behavior

License: Apache-2.0, Stargazers: 16078, Issues: 0