ShuoTang123's repositories

MATRIX

Implementation of the MATRIX framework (ICML 2024)

Language: Python | Stargazers: 36 | Issues: 2 | Issues: 0
Language: Python | License: MIT | Stargazers: 2 | Issues: 0 | Issues: 0

ContrastiveDecoding

Contrastive decoding

Language: Python | Stargazers: 1 | Issues: 0 | Issues: 0

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

Awesome-LLM-Inference

📖 A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.

License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0

Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

self-speculative-decoding

Code associated with the paper **Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding**

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

social-media-profile-scrapers

Fetch a user's data across social media

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0