qipengwang


Company: Peking University

Home Page: http://qipengwang.github.io/


qipengwang's starred repositories

interview

📚 A summary of fundamental C/C++ technical-interview knowledge, covering language features, standard libraries, data structures, algorithms, systems, networking, and linking/loading, along with interview experience, recruiting, and referral information.

Language: C++ | License: NOASSERTION | Stargazers: 34574 | Issues: 863 | Issues: 63

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | License: MIT | Stargazers: 6582 | Issues: 63 | Issues: 80
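The attention-sink idea behind streaming-llm can be sketched as a simple KV-cache eviction policy: always keep the first few "sink" tokens plus a sliding window of the most recent tokens, and evict everything in between. This is a minimal illustrative sketch; `SinkCache` and its fields are hypothetical names, not the repository's API.

```python
class SinkCache:
    """Toy eviction policy: n_sink permanent tokens + a recency window."""

    def __init__(self, n_sink=4, window=8):
        self.n_sink = n_sink   # initial "attention sink" tokens, never evicted
        self.window = window   # number of recent tokens kept
        self.tokens = []       # stands in for cached KV entries

    def append(self, token):
        self.tokens.append(token)
        budget = self.n_sink + self.window
        if len(self.tokens) > budget:
            # Evict the oldest non-sink token; sinks stay forever.
            del self.tokens[self.n_sink]

cache = SinkCache(n_sink=2, window=3)
for t in range(10):
    cache.append(t)
print(cache.tokens)  # sinks 0, 1 survive; window holds the last 3 tokens
```

The real system evicts key/value tensors per attention head rather than token IDs, but the retention pattern is the same.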

ERNIE

Official implementations for various pre-training models of ERNIE-family, covering topics of Language Understanding & Generation, Multimodal Understanding & Generation, and beyond.

Automatic_ticket_purchase

A ticket-grabbing script for Damai (大麦网)

Language: Python | License: MIT | Stargazers: 4278 | Issues: 20 | Issues: 86

AIOS

AIOS: LLM Agent Operating System

Language: Python | License: MIT | Stargazers: 3272 | Issues: 49 | Issues: 31

LyCORIS

Lora beYond Conventional methods, Other Rank adaptation Implementations for Stable diffusion.

Language: Python | License: Apache-2.0 | Stargazers: 2167 | Issues: 20 | Issues: 140

FreeU

FreeU: Free Lunch in Diffusion U-Net (CVPR2024 Oral)

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda | License: Apache-2.0 | Stargazers: 1198 | Issues: 16 | Issues: 106

U-ViT

A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".

Language: Jupyter Notebook | License: MIT | Stargazers: 896 | Issues: 12 | Issues: 28

LLM-Pruner

[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.

Language: Python | License: Apache-2.0 | Stargazers: 832 | Issues: 11 | Issues: 76

kmcuda

Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 797 | Issues: 28 | Issues: 103

paint-with-words-sd

Implementation of Paint-with-words with Stable Diffusion: the method from eDiff-I that lets you generate images from a text-labeled segmentation map.

Language: Jupyter Notebook | License: MIT | Stargazers: 636 | Issues: 23 | Issues: 33

ScaleCrafter

[ICLR 2024 Spotlight] Official implementation of ScaleCrafter for higher-resolution visual generation at inference time.

kmeans_pytorch

kmeans using PyTorch

Language: Jupyter Notebook | License: MIT | Stargazers: 472 | Issues: 7 | Issues: 37
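The algorithm at the core of this and the other k-means repositories listed here (kmcuda, torch_kmeans) is Lloyd's iteration: assign each point to its nearest center, then move each center to the mean of its cluster. A minimal framework-free sketch with illustrative names; the GPU implementations vectorize these two steps over (mini-)batches:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm over a list of coordinate tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize from the data
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # Update step: move each center to its cluster mean.
        for j, cl in enumerate(clusters):
            if cl:
                centers[j] = tuple(sum(col) / len(cl) for col in zip(*cl))
    return centers

pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
print(sorted(kmeans(pts, 2)))  # two centers, one near each blob
```

The assignment step is embarrassingly parallel, which is why it maps so well to CUDA and batched PyTorch tensor ops.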

mllm

Fast Multimodal LLM on Mobile Devices

Language: C++ | License: MIT | Stargazers: 427 | Issues: 16 | Issues: 38

qserve

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

Language: Python | License: Apache-2.0 | Stargazers: 406 | Issues: 9 | Issues: 30

APoT_Quantization

PyTorch implementation for the APoT quantization (ICLR 2020)
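APoT (additive powers-of-two) quantization builds non-uniform quantization levels as sums of terms drawn from disjoint power-of-two sets, concentrating precision near zero where weight distributions are dense. A hedged sketch of the level construction, with illustrative term sets rather than the paper's exact configuration:

```python
from itertools import product

def apot_levels(term_sets):
    """Each level is one term from each set, summed, normalized to [0, 1]."""
    raw = sorted({sum(terms) for terms in product(*term_sets)})
    hi = raw[-1]
    return [v / hi for v in raw]

def quantize(x, levels):
    """Snap a value to the nearest quantization level."""
    return min(levels, key=lambda l: abs(l - x))

# Two disjoint sets of powers of two (plus zero) -- illustrative only.
sets = [(0.0, 2**-1, 2**-3), (0.0, 2**-2, 2**-4)]
levels = apot_levels(sets)
print(levels)
print(quantize(0.4, levels))
```

Because the levels are sums of powers of two, the dot products in a quantized layer reduce to shift-and-add operations, which is the hardware-efficiency argument of the paper.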

Awesome_Multimodel_LLM

Awesome_Multimodel_LLM is a curated GitHub repository providing a comprehensive collection of resources for Multimodal Large Language Models (MLLMs). It covers datasets, tuning techniques, in-context learning, visual reasoning, foundational models, and more. Stay updated with the latest advancements.

Keras-DDPM

A Keras implementation of generative diffusion models (DDPM)

multi-lora-fine-tune

Provides efficient LLM fine-tuning via multi-LoRA optimization

Language: Python | License: Apache-2.0 | Stargazers: 190 | Issues: 3 | Issues: 42

LongQLoRA

LongQLoRA: Extend the Context Length of LLMs Efficiently

GEAR

GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLMs

Language: Python | License: MIT | Stargazers: 137 | Issues: 1 | Issues: 19

MoEBERT

This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022).

Language: Python | License: Apache-2.0 | Stargazers: 97 | Issues: 1 | Issues: 6

SVD-LLM

Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"

Language: Python | License: Apache-2.0 | Stargazers: 88 | Issues: 7 | Issues: 12

Awesome-Resource-Efficient-LLM-Papers

A curated list of high-quality papers on resource-efficient LLMs 🌱

License: CC0-1.0 | Stargazers: 70 | Issues: 5 | Issues: 0

prepacking

The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models"

Language: Jupyter Notebook | Stargazers: 57 | Issues: 2 | Issues: 1

spatten-llm

[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning

Language: Scala | License: MIT | Stargazers: 55 | Issues: 8 | Issues: 1

torch_kmeans

PyTorch implementations of KMeans, Soft-KMeans, and Constrained-KMeans that can run on GPU and work on (mini-)batches of data.

Language: Python | License: MIT | Stargazers: 51 | Issues: 3 | Issues: 9

Melon

MobiSys#114

Language: C++ | Stargazers: 21 | Issues: 0 | Issues: 0