IQTLabs / torchserve

Torchserve inference containers for multiple platforms.

TorchServe

This repository provides Docker containers for TorchServe (an inference server for PyTorch models) for multiple hardware platforms.

For example usage, see TorchServe's examples.

Instead of starting torchserve directly as shown in that example, start it with the docker command for your platform, as listed below. These commands assume your model is packaged as a .mar file in a model_store subdirectory of the directory where you start torchserve.
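
If you do not yet have a .mar file, one can be created with TorchServe's torch-model-archiver tool. A minimal sketch, where the model name (mymodel), model file (model.py), weights file (weights.pth), and handler are placeholders for your own model:

mkdir -p model_store
torch-model-archiver --model-name mymodel --version 1.0 --model-file model.py --serialized-file weights.pth --handler image_classifier --export-path model_store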

Platforms

CPU:

docker run -v $(pwd)/model_store:/model_store -p 8080:8080 --rm --name torchserve -d iqtlabs/torchserve --models <model>=<model>.mar

NVIDIA CUDA GPU:

docker run --gpus all -v $(pwd)/model_store:/model_store -p 8080:8080 --rm --name torchserve -d iqtlabs/cuda-torchserve --models <model>=<model>.mar

NVIDIA Jetson Orin:

docker run --runtime nvidia -v $(pwd)/model_store:/model_store -p 8080:8080 --rm --name torchserve -d iqtlabs/orin-torchserve --models <model>=<model>.mar
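
Once a container is running, TorchServe's inference API is available on port 8080. For example, to check that the server is up and then request a prediction (test.jpg is a placeholder input for an image model):

curl http://localhost:8080/ping
curl http://localhost:8080/predictions/<model> -T test.jpg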

Apple MPS support

Currently, Docker does not support access to Apple's MPS (Metal Performance Shaders) device, so inference in these containers on Apple hardware is CPU only. However, PyTorch itself supports MPS, so TorchServe can be run with MPS support outside a Docker container.
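
A minimal sketch of such a native (non-Docker) run, assuming TorchServe is installed via pip, a Java runtime is available, and the local PyTorch build includes MPS support:

pip install torchserve torch-model-archiver
torchserve --start --ncs --model-store model_store --models <model>=<model>.mar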

License: Apache License 2.0

