Colmet - Collecting metrics about jobs running in a distributed environment
Colmet is a monitoring tool that collects metrics about jobs running in a distributed environment, especially on clusters and grids. It currently provides several backends:
- taskstats: fetch task metrics from the Linux kernel
- stdout: display the metrics on the terminal
- zeromq: transport the metrics across the network
- hdf5: store the metrics on the filesystem
Requirements:
- a Linux kernel that supports Taskstats
- Python version 2.7 or newer
- python-zmq 2.2.0 or newer
- python-tables 3.3.0 or newer
- python-pyinotify 0.9.3-2 or newer
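The Python-side requirements listed above can be checked before installing. The following is a minimal sketch, not part of Colmet: the helper name `check_requirements` is hypothetical, and the import names `zmq`, `tables` and `pyinotify` are assumed to correspond to the python-zmq, python-tables and python-pyinotify packages. It only reports whether each module can be found. It does not verify the minimum versions or Taskstats support in the kernel, and it relies on `importlib.util.find_spec`, so it needs Python 3.

```python
import importlib.util
import sys

# Assumed import names for the packages listed above:
# python-zmq -> zmq, python-tables -> tables, python-pyinotify -> pyinotify.
REQUIRED_MODULES = ("zmq", "tables", "pyinotify")
MIN_PYTHON = (2, 7)


def check_requirements(modules=REQUIRED_MODULES, min_python=MIN_PYTHON):
    """Return a dict mapping each requirement name to True (found) or False."""
    # Compare only (major, minor) of the running interpreter.
    report = {"python": sys.version_info[:2] >= tuple(min_python)}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for requirement, ok in sorted(check_requirements().items()):
        print("%-10s %s" % (requirement, "ok" if ok else "MISSING"))
```

A module reported as missing can then be installed through the system package manager or pip before Colmet itself is installed.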
You can install, upgrade, or uninstall Colmet with these commands:
$ pip install [--user] colmet
$ pip install [--user] --upgrade colmet
$ pip uninstall colmet
Or from git (latest development version):
$ pip install [--user] git+https://github.com/oar-team/colmet.git
Or if you already pulled the sources:
$ pip install [--user] path/to/sources
Or if you don't have pip:
$ easy_install colmet
For the nodes:
sudo colmet-node -vvv --zeromq-uri tcp://127.0.0.1:5556
For the collector:
colmet-collector -vvv --zeromq-bind-uri tcp://127.0.0.1:5556 --hdf5-filepath /data/colmet.hdf5 --hdf5-complevel 9
You will see the number of counters retrieved in the debug log.
For more information, please refer to the help of these scripts (--help).
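In the single-host example above, the node and the collector use the same ZeroMQ URI. The sketch below builds both command lines from one value so the two cannot drift apart. The helper `build_commands` is hypothetical and only reuses the flags shown above. It does not run anything, and a multi-host deployment may need different bind and connect addresses.

```python
def build_commands(uri="tcp://127.0.0.1:5556",
                   hdf5_path="/data/colmet.hdf5",
                   complevel=9):
    """Return (node_cmd, collector_cmd) argument lists sharing one URI."""
    # Node side: sends metrics to the given ZeroMQ URI.
    node_cmd = ["sudo", "colmet-node", "-vvv", "--zeromq-uri", uri]
    # Collector side: binds to the same URI and stores metrics in HDF5.
    collector_cmd = [
        "colmet-collector", "-vvv",
        "--zeromq-bind-uri", uri,
        "--hdf5-filepath", hdf5_path,
        "--hdf5-complevel", str(complevel),
    ]
    return node_cmd, collector_cmd


if __name__ == "__main__":
    for cmd in build_commands():
        print(" ".join(cmd))
```

The returned lists can be passed to `subprocess.Popen` as they are, with the collector started first so that it is ready when the nodes begin sending.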
This product is distributed under the GNU General Public License version 2. Please read the LICENSE file for more information about the license.