UranusSeven / Effective-LLM-Inference-Evaluation

A project for measuring the real-world performance of Large Language Model (LLM) inference frameworks, inspired by the concepts in DeepSpeed-FastGen.
