vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Home Page: https://docs.vllm.ai

[Feature]: Build and publish Neuron docker image

yaronr opened this issue

🚀 The feature, motivation and pitch

It seems the current Docker images don't support Neuron (Inferentia).
It would be very helpful to have a tested, maintained Neuron Docker image to use.
While on the subject, it would be even better if documentation were added on running vLLM on Neuron in containers.

Alternatives

DJL?

Additional context

No response