micmarty / TensorHive

Lightweight computing resource management tool for distributed TensorFlow executed in clusters with SSH connectivity, based on KernelHive manager framework

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

TensorHive

Resource management tool for executing distributed TensorFlow based on KernelHive manager libraries.

GPU monitoring example:

  1. Install KernelHive python libraries:

pip install kernelhive

  1. Run the monitoring daemon:

python gpu_monitoring.py

  1. Add nodes that should be monitored:

cd scripts ./add_node.sh

About

Lightweight computing resource management tool for distributed TensorFlow executed in clusters with SSH connectivity, based on KernelHive manager framework


Languages

Language:Python 72.4%Language:Shell 27.6%