stefanprodan / swarmprom

Docker Swarm instrumentation with Prometheus, Grafana, cAdvisor, Node Exporter and Alert Manager

Prometheus only getting metrics from manager node

colehertz opened this issue · comments

I'm new to Prometheus and would really appreciate some help. I've been looking into this issue for quite a while. I have a swarm with 1 manager and 7 workers. The manager is a DigitalOcean instance and the workers are physical machines on my local network.

The problem is that when I go to the Grafana dashboard, only 1 node is detected. When I visit the Prometheus targets URL on port 9090, I see 8 endpoints but only 1 is up. The rest show the error "context_deadline_exceeded".

On each machine, I have set the metrics address to 0.0.0.0:9323 and set experimental mode to true. I have also opened ports 2376, 7946, and 4789 on the machines.

Any suggestions for getting metrics from the other nodes are much appreciated. Thank you!
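For reference, the engine settings described above usually live in /etc/docker/daemon.json. The sketch below only prints a candidate file for review. The function name `engine_metrics_config` is made up for illustration, and the values are the ones quoted in this post.

```shell
#!/bin/sh
# Print (not write) a daemon.json fragment with the settings from the post.
# "metrics-addr" and "experimental" are Docker daemon options; the function
# name is hypothetical and exists only for this example.
engine_metrics_config() {
  cat <<'EOF'
{
  "metrics-addr": "0.0.0.0:9323",
  "experimental": true
}
EOF
}

engine_metrics_config
```

If a node already has a daemon.json, these keys would need to be merged into it rather than overwriting the file, and the Docker daemon has to be restarted before the metrics address takes effect.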

Any news on this issue? Have you managed to solve it or find an alternative?

This problem took me three days. On all nodes, you must allow the ports that the applications use, or disable the firewall with "sudo ufw disable".
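A gentler alternative to disabling the firewall is to open only the swarm ports. This is a rough sketch that prints the ufw commands for review instead of running them. The port list is an assumption based on Docker's documented swarm ports (2377/tcp for cluster management, 7946/tcp and 7946/udp for node communication, 4789/udp for overlay traffic), and `swarm_ufw_rules` is a made-up helper name. Note that 2377, not 2376, is the cluster management port. Any other ports depend on your own setup.

```shell
#!/bin/sh
# Print (not run) one "ufw allow" command per swarm port.
# The port list is an assumption taken from Docker's swarm documentation;
# adjust it to whatever your own services need.
swarm_ufw_rules() {
  for rule in 2377/tcp 7946/tcp 7946/udp 4789/udp; do
    echo "sudo ufw allow $rule"
  done
}

swarm_ufw_rules
```

After checking the output, the printed commands can be run by hand on every node, manager and workers alike.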

I have exactly the same problem.

Screenshot of Prometheus

@colehertz Have you found a solution?

I guess one reason could be that my worker nodes need to join the swarm after swarmprom is initialized on the manager node.
UFW is disabled.
Docker v18+ shouldn't be the problem, should it?

$ docker version
Client:
 Version:           18.09.7
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        2d0083d
 Built:             Thu Jun 27 17:57:09 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.7
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       2d0083d
  Built:            Thu Jun 27 17:23:02 2019
  OS/Arch:          linux/amd64
  Experimental:     true

I found my problem: the overlay network swarmprom_net was not working correctly. This is not a misconfiguration in this project; it happened during swarm creation, because I used a floating IP as the swarm advertise address. This seems to be an unsolved bug.

Using the private IP in the swarm init command solved the issue for me.
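For anyone landing here, the fix amounts to passing the private address to `--advertise-addr`. Below is a minimal sketch that only prints the command. The address 10.0.0.5 is a placeholder, not a value from this thread, and `swarm_init_cmd` is a hypothetical helper.

```shell
#!/bin/sh
# Placeholder private address (assumption); replace it with the
# manager's own private IP.
PRIVATE_IP="${PRIVATE_IP:-10.0.0.5}"

# Print (not run) the swarm init command for the given advertise address.
swarm_init_cmd() {
  echo "docker swarm init --advertise-addr $1"
}

swarm_init_cmd "$PRIVATE_IP"
```

An existing swarm would most likely have to be left and re-created for a new advertise address to take effect, with the workers joining again afterwards.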

Guys, use the PRIVATE IP in swarm init.
You saved my day, thanks.