elastic-ai / elastic-gpu-agent

elastic-gpu-agent is a Kubernetes device plugin for GPU resource allocation on nodes.

Elastic GPU Agent

Elastic GPU Agent is a Kubernetes device plugin implementation for allocating and using GPUs in containers. It runs as a DaemonSet on Kubernetes nodes and does the following:

  • Registers GPU core and memory resources on the node
  • Allocates and shares GPU resources among containers
  • Supports GPU resource QoS and isolation with a specific GPU driver (e.g. Elastic GPU)

For the complete solution and further details, please refer to Elastic GPU Scheduler.
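To make the "allocate and share" step above more concrete, here is a minimal sketch of how fractional GPU requests could be packed onto the GPUs of one node. It is not the agent's actual algorithm: the `gpu` and `request` types, the units (percent of compute, MiB of memory), and the first-fit policy are all assumptions chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// gpu models the remaining shareable capacity of one physical GPU.
// Hypothetical type; the units are assumptions made for this sketch.
type gpu struct {
	freeCore   int // remaining compute, in percent of one GPU (0-100)
	freeMemory int // remaining memory, in MiB
}

// request is what a single container asks for (hypothetical type).
type request struct {
	core   int // percent of one GPU
	memory int // MiB
}

var errNoFit = errors.New("no GPU has enough free core and memory")

// allocate places req on the first GPU that can hold it (first-fit) and
// returns that GPU's index. Several containers may land on the same GPU,
// which is what makes the device shared.
func allocate(gpus []gpu, req request) (int, error) {
	for i := range gpus {
		if gpus[i].freeCore >= req.core && gpus[i].freeMemory >= req.memory {
			gpus[i].freeCore -= req.core
			gpus[i].freeMemory -= req.memory
			return i, nil
		}
	}
	return -1, errNoFit
}

func main() {
	// Two hypothetical 16 GiB GPUs on one node.
	node := []gpu{{100, 16384}, {100, 16384}}
	for _, r := range []request{{50, 8192}, {50, 4096}, {30, 4096}} {
		idx, err := allocate(node, r)
		if err != nil {
			fmt.Println("request rejected:", err)
			continue
		}
		fmt.Printf("request %+v placed on GPU %d\n", r, idx)
	}
}
```

First-fit is used here only because it is the simplest policy to read; a real scheduler may prefer bin-packing or spreading, and enforcing the limits inside the container is the job of the GPU driver mentioned above.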

Prerequisites

  • Kubernetes v1.17+
  • Go 1.16+
  • NVIDIA drivers
  • nvidia-docker
  • nvidia set as the Docker default runtime: add "default-runtime": "nvidia" to /etc/docker/daemon.json, then restart the Docker daemon.
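The last prerequisite is easy to miss, so a small preflight check can help. The sketch below is not part of elastic-gpu-agent; `usesNvidiaRuntime` is a hypothetical helper that parses a daemon.json document and reports whether "default-runtime" is set to "nvidia". It assumes the file lives at /etc/docker/daemon.json, as stated above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// usesNvidiaRuntime reports whether a Docker daemon.json document sets
// "default-runtime" to "nvidia". All other keys are ignored.
// Hypothetical helper, not part of elastic-gpu-agent.
func usesNvidiaRuntime(daemonJSON []byte) (bool, error) {
	var cfg struct {
		DefaultRuntime string `json:"default-runtime"`
	}
	if err := json.Unmarshal(daemonJSON, &cfg); err != nil {
		return false, fmt.Errorf("parse daemon.json: %w", err)
	}
	return cfg.DefaultRuntime == "nvidia", nil
}

func main() {
	const path = "/etc/docker/daemon.json"
	data, err := os.ReadFile(path) // os.ReadFile requires Go 1.16+
	if err != nil {
		fmt.Println("could not read config:", err)
		return
	}
	ok, err := usesNvidiaRuntime(data)
	if err != nil {
		fmt.Println("could not check config:", err)
		return
	}
	if ok {
		fmt.Println("nvidia is the default Docker runtime")
	} else {
		fmt.Println(`"default-runtime" is not "nvidia"; edit`, path, "and restart Docker")
	}
}
```

The check only inspects the configuration file; it does not confirm that the Docker daemon has been restarted since the file was changed.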

Build Image

Run make or TAG=<image-tag> make to build the elastic-gpu-agent image.

Getting Started

Deploy Elastic GPU Agent as follows:

$ kubectl apply -f deploy/elastic-gpu-agent.yaml

More details are available in the Elastic GPU Scheduler project.

License

Distributed under the Apache License 2.0.
