qipengwang's starred repositories
interview
📚 A summary of the fundamentals for C/C++ technical interviews, covering the language, program libraries, data structures, algorithms, operating systems, networking, and linking/loading, plus interview experience, recruitment, and internal-referral information.
streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
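The trick is simple enough to sketch: keep the first few "attention sink" tokens plus a sliding window of recent tokens in the KV cache, and evict everything in between. Below is a minimal, illustrative take on that eviction policy, assuming KV tensors indexed by sequence position along dim 0; `n_sink` and `window` are hypothetical parameter names, not the repo's API.

```python
import torch

def evict_kv(keys: torch.Tensor, values: torch.Tensor,
             n_sink: int = 4, window: int = 1020):
    """Keep the first n_sink tokens ("attention sinks") plus the most
    recent `window` tokens; drop everything in between."""
    seq_len = keys.shape[0]
    if seq_len <= n_sink + window:
        return keys, values  # nothing to evict yet
    keep = torch.cat([
        torch.arange(n_sink),                     # sink tokens
        torch.arange(seq_len - window, seq_len),  # recent window
    ])
    return keys[keep], values[keep]
```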
Automatic_ticket_purchase
A ticket-grabbing script for Damai (大麦网)
flashinfer
FlashInfer: Kernel Library for LLM Serving
LLM-Pruner
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
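For intuition, here is a heavily simplified structural-pruning sketch that drops whole output channels of a linear layer by weight magnitude. LLM-Pruner itself scores coupled structures with gradient-based importance; plain magnitude is used here only to keep the example self-contained.

```python
import torch
import torch.nn as nn

def prune_rows(linear: nn.Linear, keep_ratio: float = 0.8) -> nn.Linear:
    # one importance score per output channel (row of the weight matrix)
    importance = linear.weight.abs().sum(dim=1)
    n_keep = max(1, int(keep_ratio * linear.out_features))
    keep = importance.topk(n_keep).indices.sort().values  # preserve order
    pruned = nn.Linear(linear.in_features, n_keep,
                       bias=linear.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(linear.weight[keep])
        if linear.bias is not None:
            pruned.bias.copy_(linear.bias[keep])
    return pruned
```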
paint-with-words-sd
Implementation of Paint-with-Words with Stable Diffusion: a method from eDiff-I that lets you generate an image from a text-labeled segmentation map.
ScaleCrafter
[ICLR 2024 Spotlight] Official implementation of ScaleCrafter for higher-resolution visual generation at inference time.
kmeans_pytorch
kmeans using PyTorch
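For reference, the underlying algorithm fits in a few lines of plain PyTorch. This is a generic sketch of Lloyd's iterations, not the library's own API (see its README for that):

```python
import torch

def kmeans(x: torch.Tensor, k: int, n_iter: int = 50):
    # x: (n_points, dim); initialize centers from random data points
    centers = x[torch.randperm(x.shape[0])[:k]].clone()
    for _ in range(n_iter):
        # assign each point to its nearest center
        dists = torch.cdist(x, centers)   # (n_points, k)
        labels = dists.argmin(dim=1)
        # recompute each center as the mean of its assigned points
        for j in range(k):
            mask = labels == j
            if mask.any():
                centers[j] = x[mask].mean(dim=0)
    return labels, centers
```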
APoT_Quantization
PyTorch implementation for the APoT quantization (ICLR 2020)
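The idea, roughly: quantization levels are sums of powers of two, so multiplications reduce to shifts and adds. A toy sketch with two additive terms, projecting a tensor onto such a level set; the paper's actual level construction is more careful about spacing, and the term sets below are illustrative.

```python
import torch

def apot_levels() -> torch.Tensor:
    # each level is a sum of one value from each power-of-two term set
    term_a = torch.tensor([0.0, 2 ** -1, 2 ** -3, 2 ** -5])
    term_b = torch.tensor([0.0, 2 ** -2, 2 ** -4, 2 ** -6])
    levels = (term_a[:, None] + term_b[None, :]).flatten()
    levels = torch.cat([levels, -levels]).unique()
    return levels / levels.max()              # normalize to [-1, 1]

def apot_quantize(w: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    # project every weight onto the nearest APoT level, scaled by alpha
    levels = alpha * apot_levels()
    idx = (w.flatten()[:, None] - levels[None, :]).abs().argmin(dim=1)
    return levels[idx].reshape(w.shape)
```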
Awesome_Multimodel_LLM
Awesome_Multimodel is a curated GitHub repository that provides a comprehensive collection of resources for Multimodal Large Language Models (MLLM). It covers datasets, tuning techniques, in-context learning, visual reasoning, foundational models, and more, tracking the latest advancements.
Keras-DDPM
A Keras implementation of generative diffusion models (DDPM)
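The forward (noising) process such a model is trained to invert has the closed form q(x_t | x_0) = N(sqrt(ᾱ_t)·x_0, (1 − ᾱ_t)·I). A minimal NumPy sketch with a linear beta schedule; the repo implements the full training and sampling loop in Keras.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)       # linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)      # cumulative signal retention

def q_sample(x0: np.ndarray, t: int, rng=np.random.default_rng()):
    """Draw x_t ~ q(x_t | x_0) in closed form."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps   # the network learns to predict eps from (xt, t)
```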
multi-lora-fine-tune
Provides efficient LLM fine-tuning via multi-LoRA optimization
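As background, a single LoRA adapter adds a low-rank update to a frozen linear layer: h = Wx + (alpha/r)·B(Ax), with only A and B trainable. A minimal sketch follows; the repo's contribution, batching many such adapters through one shared base model, is not shown.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pretrained weight
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen path plus scaled low-rank update
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```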
Awesome-Resource-Efficient-LLM-Papers
A curated list of high-quality papers on resource-efficient LLMs 🌱
prepacking
The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models"
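The gist: instead of padding every prompt in a batch to the longest one, bin-pack several short prompts into each row and keep them independent via the attention mask. A first-fit-decreasing sketch over prompt lengths (mask construction omitted, and `capacity` is an illustrative parameter):

```python
def pack_prompts(lengths: list[int], capacity: int) -> list[list[int]]:
    """Group prompt indices into bins whose total length fits `capacity`."""
    order = sorted(range(len(lengths)), key=lambda i: -lengths[i])
    bins, room = [], []            # prompt indices per bin, remaining space
    for i in order:
        for b, free in enumerate(room):
            if lengths[i] <= free:       # first bin with enough room
                bins[b].append(i)
                room[b] -= lengths[i]
                break
        else:                            # no bin fits: open a new one
            bins.append([i])
            room.append(capacity - lengths[i])
    return bins

# e.g. pack_prompts([500, 120, 60, 300, 90], capacity=512)
# -> [[0], [3, 1, 4], [2]]  (each row's lengths sum to <= 512)
```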
spatten-llm
[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning
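In spirit, cascade token pruning keeps only the tokens that accumulate the most attention across heads and queries, so later layers process fewer tokens. An illustrative tensor-level sketch; the paper's contribution is the accelerator architecture that performs this on-chip, not this software form.

```python
import torch

def prune_tokens(attn: torch.Tensor, hidden: torch.Tensor, keep: int):
    """attn: (n_heads, seq, seq) attention probs; hidden: (seq, dim)."""
    # cumulative importance: attention received, summed over heads and queries
    importance = attn.sum(dim=(0, 1))                  # (seq,)
    idx = importance.topk(keep).indices.sort().values  # keep original order
    return hidden[idx], idx
```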
torch_kmeans
PyTorch implementations of KMeans, Soft-KMeans and Constrained-KMeans which can be run on GPU and work on (mini-)batches of data.