A high-throughput and memory-efficient inference and serving engine for LLMs
Home Page: https://docs.vllm.ai
Repository from GitHub: https://github.com/h2oai/vllm-joeg
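
As a fork of vLLM, this repository presumably exposes the upstream vLLM Python API for offline batched inference. A minimal sketch, assuming the standard `LLM` and `SamplingParams` entry points and using a small example model (`facebook/opt-125m` is an illustrative choice, not one mandated by this repo):

```python
from vllm import LLM, SamplingParams

# Prompts are processed as a batch for high throughput.
prompts = [
    "Hello, my name is",
    "The capital of France is",
]

# Sampling configuration shared across the batch.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Load the model; vLLM manages KV-cache memory via PagedAttention.
llm = LLM(model="facebook/opt-125m")

# Generate completions for all prompts in one call.
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(f"Prompt: {output.prompt!r}")
    print(f"Generated: {output.outputs[0].text!r}")
```

For online serving, upstream vLLM also ships an OpenAI-compatible HTTP server (`python -m vllm.entrypoints.openai.api_server --model <model>`); whether this fork changes that entry point is not stated here.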