Grafana Dashboard for Swarm Nodes contains incorrect query.

Question

darkl0rd opened this issue 5 years ago · comments

The "Containers network traffic by Node" panel returns an error when Prometheus was restarted within the selected time range and the series come back with a different value for the "instance" label.

The queries used on the dashboard are:

sum(rate(container_network_receive_bytes_total{container_label_com_docker_swarm_node_id=~"$node_id"}[$interval]) * on(container_label_com_docker_swarm_node_id) group_left(node_name) node_meta) by (node_name)

- sum(rate(container_network_transmit_bytes_total{container_label_com_docker_swarm_node_id=~"$node_id"}[$interval]) * on(container_label_com_docker_swarm_node_id) group_left(node_name) node_meta) by (node_name)

I have been trying to rewrite the query to get it to work, without any success.

Reproduction case:

1. Run Prometheus.
2. Collect data for x minutes.
3. Restart Prometheus.

When the restart falls within the time range you select in Grafana, the graph fails with a "multiple series error".

darkl0rd · Answer 1 · Mon Feb 18 2019 14:35:50 GMT+0800 (China Standard Time)

sum by(node_name) (irate(container_network_receive_bytes_total[$interval]) * on(container_label_com_docker_swarm_node_id) group_left(node_name) sum without(instance) (node_meta{node_id=~"$node_id"}))
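
To illustrate why aggregating node_meta with "sum without(instance)" before the join avoids the error, here is a rough Python sketch that mimics the label matching at a single evaluation timestamp. It is a toy model, not Prometheus code: the helper names (sum_without, join_group_left), the sample label values, and the assumption that every node_meta sample has the value 1 are all made up for illustration.

```python
from collections import defaultdict

NODE_ID = "container_label_com_docker_swarm_node_id"


def sum_without(series, drop):
    """Toy version of `sum without(<drop>)`: remove the given labels,
    then add up all samples whose remaining labels are identical."""
    totals = defaultdict(float)
    for labels, value in series:
        key = frozenset((k, v) for k, v in labels.items() if k not in drop)
        totals[key] += value
    return [(dict(key), value) for key, value in totals.items()]


def join_group_left(many, one, on, copy):
    """Toy version of `many * on(<on>) group_left(<copy>) one`.
    The "one" side must have at most one series per value of `on`."""
    index = {}
    for labels, value in one:
        key = labels[on]
        if key in index:
            raise ValueError(f"multiple series on the 'one' side for {on}={key}")
        index[key] = (labels, value)
    result = []
    for labels, value in many:
        match = index.get(labels[on])
        if match is None:
            continue  # no partner series: the sample is dropped
        one_labels, one_value = match
        merged = dict(labels)
        for name in copy:
            merged[name] = one_labels[name]
        result.append((merged, value * one_value))
    return result


# Hypothetical per-container receive rates on one Swarm node.
rx_rate = [
    ({NODE_ID: "abc", "name": "web.1"}, 100.0),
    ({NODE_ID: "abc", "name": "db.1"}, 50.0),
]

# Hypothetical node_meta samples: the same node seen under two
# "instance" values, as assumed to happen around a restart.
node_meta = [
    ({NODE_ID: "abc", "node_name": "worker-1", "instance": "10.0.0.5:9100"}, 1.0),
    ({NODE_ID: "abc", "node_name": "worker-1", "instance": "10.0.0.9:9100"}, 1.0),
]

# Shape of the original dashboard query: the join sees two candidates.
try:
    join_group_left(rx_rate, node_meta, NODE_ID, ["node_name"])
except ValueError as err:
    print("original query shape fails:", err)

# Shape of the rewritten query: collapse "instance" first, then join.
deduped = sum_without(node_meta, {"instance"})
joined = join_group_left(rx_rate, deduped, NODE_ID, ["node_name"])
for labels, value in joined:
    print(labels["node_name"], labels["name"], value)
```

One thing the sketch makes visible: where both node_meta series are present at the same timestamp, sum adds their values to 2, so the product is doubled at that point. If that turns out to matter in practice, "max without(instance)" is an alternative aggregation that keeps the value at 1.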