FMInference / FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.

FMInference/FlexLLMGen Stargazers