KiritoHugh / Awesome-DB4LLM

Awesome-DB4LLM

Inference system

  • [ SOSP 2023 ] Efficient Memory Management for Large Language Model Serving with PagedAttention (vLLM) [paper] [project]
  • [ ICML 2023 ] FlexGen: High-throughput Generative Inference of Large Language Models with a Single GPU [paper] [project]
