stefanprodan / swarmprom

Docker Swarm instrumentation with Prometheus, Grafana, cAdvisor, Node Exporter and Alert Manager

Prometheus.yml needs to be pulled into Docker Configs

Illydth opened this issue · comments

Generally speaking, SwarmProm is a great starting point. One issue we're running into while implementing this solution, however, is that (at this point) there is no way to extend the Prometheus scrape configuration to cover other services.

For instance, we would like to monitor Traefik with Prometheus. (As an aside, you should look at replacing Caddy with Traefik in your stack. In my opinion, it's an easier traffic router to configure than Caddy, with fewer random config files. YMMV.)

However, when I pull prometheus.yml out (create a Docker config for it and add that config to the monitoring stack file), starting Prometheus fails with:

"mv: can't rename '/tmp/prometheus.yml': Device or resource busy"

Meaning prometheus appears to already be running by the time Docker attempts to mount the prometheus.yml file into /etc/prometheus.

The only way to add to the scrape configs at this point is to download your Dockerfile / prometheus.yml file and re-build the prometheus container...so the prometheus included in this stack cannot really be extended to monitor other things.

Help a guy out? There's got to be a way to externalize the prometheus.yml file so that it can come in from docker configs (like the rules files do).

I've just hit the same point trying to set up an snmp-exporter and get it scraped. This is a great starting point but the learning curve is steep!

Yes I did, but snmp-exporter requires further configuration to include the host IP that gets scraped so I have presumed I can't use that.

https://github.com/prometheus/snmp_exporter#prometheus-configuration

@Illydth @stefanprodan is your issue with Traefik resolved? Can we connect Traefik in a different stack to this swarmprom stack?
My Traefik service is working fine, but I cannot see any data in the dashboard even though the service name and port are added in prometheus.yaml using entrypoint.sh.

If you curl Traefik at /metrics from the mon stack, does it work?

@stefanprodan Are you planning to integrate swarmprom with HipChat?
It's working well with Slack, but we want to integrate it with HipChat. In fact, I have already configured it for HipChat and we are already getting HipChat alerts, but they are not in a readable format.

@stefanprodan It'd also be advantageous to have prometheus.yml pulled into docker configs so someone can change scrape_interval and evaluation_interval.
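
For reference, these two settings live in the `global` section of prometheus.yml. A minimal sketch of what someone would override once the file is externalized; the 15s values are illustrative assumptions, not necessarily what the swarmprom image ships with:

```yaml
# Fragment of prometheus.yml (values are placeholders, adjust as needed)
global:
  scrape_interval: 15s      # how often targets are scraped
  evaluation_interval: 15s  # how often alerting/recording rules are evaluated
```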

I too think there is no harm in making prometheus.yml a config. Users don't need to change it unless needed, but they can always change it if required.

+1 as docker config

I vote for config too, but for the meantime I'd like to make it work the other way around.
What I've done is:

  • start traefik with the metrics activated -> if I go to http://10.1.1.124:8181/metrics I see them
  • start prometheus, adding these lines to the docker-compose.yml:
    environment:
    - JOBS=10.1.1.124:8181
  • this is the env inside prometheus:
/prometheus $ env
JOBS=10.1.1.124:8181
HOSTNAME=0010ea9f96f7
SHLVL=1
HOME=/home
TERM=xterm
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
WEAVE_TOKEN=none
PWD=/prometheus
  • check that the prometheus container can see the metrics:
/prometheus $ wget http://10.1.1.124:8181/metrics
Connecting to 10.1.1.124:8181 (10.1.1.124:8181)
metrics              100% |*******************************| 11714   0:00:00 ETA
/prometheus $ ls -ltr
total 20
-rw-------    1 nobody   nogroup          2 Jun 28 14:29 lock
drwxr-xr-x    2 nobody   nogroup         48 Jun 28 14:29 wal
-rw-r--r--    1 nobody   nogroup      11714 Jun 28 14:50 metrics
/prometheus $ cat metrics
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 6.2989e-05
go_gc_duration_seconds{quantile="0.25"} 0.000144349
go_gc_duration_seconds{quantile="0.5"} 0.000203901

Yet when I import a dashboard inside grafana (for example the one with id 2240), I see all empty tabs :(

+1 for prometheus.yml as a Docker config

Has anyone been able to solve this without having to rebuild and maintain a new container?

I'm hitting the same issue, with the Jenkins exporter for instance. It defaults to metrics_path: /prometheus, and it won't allow this to be changed to /metrics because that conflicts with the Jenkins metrics plugin. I'd love to avoid building/testing/maintaining this image just for that.
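
With an externalized prometheus.yml, a non-default metrics path could be handled by a scrape job along these lines. This is only a sketch: the job name and the `jenkins:8080` target are hypothetical placeholders, while `metrics_path` and `static_configs` are standard Prometheus scrape options.

```yaml
# Fragment of prometheus.yml; job name and target are assumptions
scrape_configs:
  - job_name: 'jenkins'
    metrics_path: /prometheus   # the exporter's default path mentioned above
    static_configs:
      - targets: ['jenkins:8080']
```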

Any news? I've been hitting the same wall. prometheus.yml as a Docker config seems like a must, IMHO.

Update: I managed to use this guy's images, which are based on stefan's images with slight changes to entrypoint.sh that bypass the issue without the need to build your own Dockerfile.

https://hub.docker.com/r/prom/prometheus
https://hub.docker.com/r/prom/alertmanager

Any updates regarding this issue?
EDIT:
I actually figured it out by trial and error. It is absolutely possible to pull prometheus.yml into docker configs as of v2.5.0.

I edited the top of the docker-compose file like so:

configs:
  dockerd_config:
    file: ./dockerd-exporter/Caddyfile
  node_rules:
    file: ./prometheus/rules/swarm_node.rules.yml
  task_rules:
    file: ./prometheus/rules/swarm_task.rules.yml
  # new entry for the externalized Prometheus config
  promfile:
    file: ./prometheus/conf/prometheus.yml

and at the bottom I added the config file

  prometheus:
    image: stefanprodan/swarmprom-prometheus:v2.5.0
    networks:
      - default
      - net
      - traefik-public
    command:
      - '--web.enable-lifecycle'
      - '--config.file=/etc/prometheus/prometheus_custom.yml'  # changed: non-default path
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention=${PROMETHEUS_RETENTION:-24h}'
    volumes:
      - prometheus:/prometheus
    configs:
      - source: promfile
        target: /etc/prometheus/prometheus_custom.yml  # changed: matches --config.file
      - source: node_rules
        target: /etc/prometheus/swarm_node.rules.yml
      - source: task_rules
        target: /etc/prometheus/swarm_task.rules.yml

Note that I had to change the path to prometheus.yml in the command. Providing the config at the default path conflicts with the image's mechanism and crashes.
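
To tie this back to the original Traefik question, here is a rough sketch of what the externalized ./prometheus/conf/prometheus.yml might add. The job name, the `tasks.traefik` DNS name and port 8080 are assumptions for illustration and depend on how Traefik is deployed. Because `--config.file` now points at this file, any jobs and rule file references wanted from the image's stock config would have to be carried over into it as well.

```yaml
# Sketch of an extra scrape job in the custom prometheus.yml.
# 'tasks.traefik' and port 8080 are hypothetical; use your own service name and metrics port.
scrape_configs:
  - job_name: 'traefik'
    dns_sd_configs:
      - names:
          - 'tasks.traefik'   # Swarm DNS name resolving to each task of the service
        type: 'A'
        port: 8080
```

For this to work, the Prometheus service has to share an overlay network with the Traefik service, as with the traefik-public network in the snippet above.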

I recommend this to be closed after an edit to the docs :)

Can you share your repo with your docker-compose file, please?

@AKIA-Kali I'm sorry, it's not public, but you should be able to achieve it using the code I provided; you may need to run docker compose twice if it fails the first time. Also, keep in mind that to update a Docker config you need to disconnect it from any running containers, delete it, and then recreate it with the updated data.
https://docs.docker.com/engine/swarm/configs/
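
One common way to avoid the manual delete-and-recreate cycle is to give the config a versioned name, so each change creates a new config and the service is switched over to it on the next deploy. A sketch, assuming a compose file format of 3.5 or later (needed for the `name` key) and a hypothetical PROMFILE_VERSION environment variable that you bump whenever prometheus.yml changes:

```yaml
version: "3.5"

configs:
  promfile:
    # PROMFILE_VERSION is a hypothetical variable; a new value yields a new config name
    name: promfile-${PROMFILE_VERSION:-1}
    file: ./prometheus/conf/prometheus.yml
```

The service keeps referring to `source: promfile`; only the underlying config name changes. Old versions stay around until removed, so they may need occasional cleanup.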