ritazh / k8s-distributed-inference

Repository from Github https://github.comritazh/k8s-distributed-inferenceRepository from Github https://github.comritazh/k8s-distributed-inference

k8s-distributed-inference

Command to deploy WebAPI Server on GPU

kubectl patch deployment localai-main
--type='json'
-p='[{"op": "replace", "path": "/spec/template/spec/containers/0/args/2", "value": "--p2ptoken="}, {"op": "replace", "path": "/spec/template/spec/containers/0/image", "value": "localai/localai:latest-aio-gpu-nvidia-cuda-12"}]'

Command to deploy Workers with GPU

kubectl patch deployment localai-workers
--type='json'
-p='[{"op": "replace", "path": "/spec/template/spec/containers/0/args/2", "value": "--p2ptoken="}, {"op": "replace", "path": "/spec/template/spec/containers/0/image", "value": "localai/localai:latest-aio-gpu-nvidia-cuda-12"}]'

About

License:MIT License