triton-inference-server / triton_cli

Triton CLI is an open source command line interface that enables users to create, deploy, and profile models served by the Triton Inference Server.


Documentation suggestion

IAINATDBI opened this issue · comments

It might be worth adding a note that when serving an LLM from the CLI within a container, the `triton start` command does not return, so you need to open a new shell with `docker exec` in order to run any `infer` commands. This might be obvious, but it would help with understanding the overall process.
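As a rough sketch of the workflow being described (the container name `triton` and image tag here are placeholders, not from the original report):

```shell
# Terminal 1: start the server inside the container.
# This command blocks and does not return while the server is running.
docker exec -ti triton triton start

# Terminal 2: since the shell above is occupied, attach a second shell
# to the same container to interact with the running server.
docker exec -ti triton triton infer -m my_model --prompt "Hello"
```

The key point is simply that `triton start` is a foreground process, so any follow-up `triton` commands need their own shell session in the same container.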

I did this successfully. I really like the tensor detail that comes back from the `infer` command.

Cheers

Hi @IAINATDBI, thanks for calling this out. We'll try to improve this clarification.