zhuyiche / FlexGen

Running large language models on a single GPU for throughput-oriented scenarios.

zhuyiche/FlexGen Stargazers