Prometheus container is continuously restarting "Received SIGTERM, exiting gracefully..."
gauravgoyal0086 opened this issue
I am using this repo to create a monitoring stack for our production Swarm environments.
I have made the following changes to the Prometheus configuration:
- Removed docker-entrypoint.sh
- Attached my prometheus.yml file (below)
- Attached the Prometheus Dockerfile (below)
- Modified docker-compose.yml
The whole code is shared at https://codeshare.io/5gb8My
I can deploy all services, but the Prometheus container keeps restarting with the log below. Can you please help me fix this problem?
```
deb795407a (none))"
level=info ts=2018-03-07T17:07:38.10631854Z caller=main.go:228 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2018-03-07T17:07:38.109652503Z caller=main.go:502 msg="Starting TSDB ..."
level=info ts=2018-03-07T17:07:38.127573843Z caller=web.go:383 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2018-03-07T17:07:38.574693038Z caller=main.go:512 msg="TSDB started"
level=info ts=2018-03-07T17:07:38.574933556Z caller=main.go:588 msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
level=info ts=2018-03-07T17:07:38.578334416Z caller=main.go:489 msg="Server is ready to receive web requests."
level=warn ts=2018-03-07T17:08:05.313728189Z caller=main.go:366 msg="Received SIGTERM, exiting gracefully..."
level=info ts=2018-03-07T17:08:05.313788495Z caller=main.go:390 msg="Stopping scrape discovery manager..."
level=info ts=2018-03-07T17:08:05.3138142Z caller=main.go:403 msg="Stopping notify discovery manager..."
level=info ts=2018-03-07T17:08:05.313828264Z caller=main.go:427 msg="Stopping scrape manager..."
level=info ts=2018-03-07T17:08:05.313855348Z caller=main.go:386 msg="Scrape discovery manager stopped"
level=info ts=2018-03-07T17:08:05.313893078Z caller=main.go:399 msg="Notify discovery manager stopped"
level=info ts=2018-03-07T17:08:05.31401654Z caller=main.go:421 msg="Scrape manager stopped"
level=info ts=2018-03-07T17:08:05.317560586Z caller=manager.go:460 component="rule manager" msg="Stopping rule manager..."
level=info ts=2018-03-07T17:08:05.317627258Z caller=manager.go:466 component="rule manager" msg="Rule manager stopped"
level=info ts=2018-03-07T17:08:05.31764061Z caller=notifier.go:493 component=notifier msg="Stopping notification manager..."
level=info ts=2018-03-07T17:08:05.317659353Z caller=main.go:573 msg="Notifier manager stopped"
level=info ts=2018-03-07T17:08:05.317714607Z caller=main.go:584 msg="See you next time!"
```
```
docker@manager:/Users/gaurav.goyal/gg/swarmprom/prometheus/conf$ cat prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'promswarm'

rule_files:
  - "swarm_node.rules.yml"
  - "swarm_task.rules.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'dockerd-exporter'
    dns_sd_configs:
      - names:
          - 'tasks.dockerd-exporter'
        type: 'A'
        port: 9323

  - job_name: 'cadvisor'
    dns_sd_configs:
      - names:
          - 'tasks.cadvisor'
        type: 'A'
        port: 8080

  - job_name: 'node-exporter'
    dns_sd_configs:
      - names:
          - 'tasks.node-exporter'
        type: 'A'
        port: 9100

  - job_name: 'grafana'
    dns_sd_configs:
      - names:
          - 'tasks.grafana'
        type: 'A'
        port: 3000
```
Prometheus Dockerfile:
```
FROM prom/prometheus:v2.2.0-rc.0
COPY conf/ /etc/prometheus/
#ENTRYPOINT [ "/etc/prometheus/docker-entrypoint.sh" ]
CMD [ "--config.file=/etc/prometheus/prometheus.yml", \
      "--storage.tsdb.path=/prometheus", \
      "--web.console.libraries=/usr/share/prometheus/console_libraries", \
      "--web.console.templates=/usr/share/prometheus/consoles" ]
```
This issue is resolved.
It was the service healthcheck that was causing the SIGTERM.
@gauravgoyal0086 Can you please provide some more details about how you solved this problem?
I see the same SIGTERM message.
Thanks in advance.
I'm facing the same issue, so I'm interested in the solution as well.
To tell you the truth, I don't see how you solved it.
I'm facing the same problem!
Does anyone understand the cause?
Take a look at the `healthcheck:` section of the Prometheus service here for a working healthcheck for the prom container:
https://github.com/swarmstack/swarmstack/blob/master/docker-compose.yml
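For readers who do not want to dig through the linked file, here is a minimal sketch of what such a `healthcheck:` section could look like. It is not copied from the linked repository. It assumes the compose file format is 3.4 or later (for `start_period`), that `wget` is available inside the image, and that Prometheus listens on its default port 9090; the timing values are illustrative only.

```yaml
# Hypothetical excerpt of a docker-compose.yml; adjust names and timings to your stack.
version: "3.4"

services:
  prometheus:
    image: prom/prometheus:v2.2.0-rc.0
    healthcheck:
      # /-/healthy is Prometheus's built-in health endpoint.
      # A plain GET is used here (output discarded) rather than a HEAD request.
      test: ["CMD", "wget", "-q", "-O", "/dev/null", "http://localhost:9090/-/healthy"]
      interval: 30s      # how often the check runs (illustrative value)
      timeout: 5s        # how long one check may take (illustrative value)
      retries: 3         # consecutive failures before the task is marked unhealthy
      start_period: 30s  # grace period while the TSDB starts (illustrative value)
```

If the check command fails repeatedly (wrong path, wrong port, or a missing binary in the image), Swarm marks the task unhealthy and replaces it, which shows up in the Prometheus log as "Received SIGTERM, exiting gracefully...".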
I had the same issue. Updating the livenessProbe in the Kubernetes Deployment config to use GET /-/healthy
as the path resolved it.
YES, the path has /-/ in it 😄
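As a rough illustration of that fix, a liveness probe pointing at `/-/healthy` might look like the sketch below. The container name, image tag, and timing values are assumptions for the example, not taken from the commenter's actual manifest.

```yaml
# Hypothetical excerpt of a Deployment's pod spec (spec.template.spec).
containers:
  - name: prometheus                      # assumed container name
    image: prom/prometheus:v2.2.0-rc.0    # assumed image tag
    ports:
      - containerPort: 9090
    livenessProbe:
      httpGet:
        path: /-/healthy   # note the /-/ prefix
        port: 9090
      initialDelaySeconds: 30   # illustrative value
      periodSeconds: 15         # illustrative value
      timeoutSeconds: 5         # illustrative value
      failureThreshold: 3       # restarts the container after 3 failed probes
```

When the probe path is wrong, the kubelet sees repeated failures and restarts the container, which produces the same SIGTERM message in the Prometheus log.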