Jenkins infra monitoring
This repository holds the configuration files for Jenkins infrastructure monitoring.
DataDog
Jenkins infrastructure is monitored using DataDog.
- Main DataDog organization: https://jenkins.datadoghq.com
- Staging DataDog organization: https://jenkins-staging.datadoghq.com
Monitors
Add a monitor
- For convenience, you can first create the monitor in the DataDog UI, then use the Export feature (in the monitor settings) to get a JSON representation of the monitor.
- Describe the monitor in datadog-monitors.tf using HCL.
- Submit a PR. At this point, your monitor should be created in the DataDog staging organization.
- Once the PR is merged into the master branch, the Terraform configuration is applied to the main DataDog organization (after an approval input step).
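As an illustration of describing a monitor in HCL, here is a minimal sketch of what an entry in datadog-monitors.tf might look like. It assumes the Terraform DataDog provider is already configured in this repository. The resource name, query, threshold, and message are hypothetical and are not taken from the actual configuration. The values map onto the fields of the same name in the exported JSON.

```hcl
# Hypothetical monitor: the name, query, threshold, and message are
# illustrative only and should be replaced with the exported values.
resource "datadog_monitor" "confluence_cpu" {
  # "name", "type", "message", and "query" correspond to the fields of
  # the same name in the JSON produced by the Export feature.
  name    = "Confluence: high CPU usage"
  type    = "metric alert"
  message = "CPU usage on Confluence hosts has stayed above 90% for 5 minutes."
  query   = "avg(last_5m):avg:system.cpu.user{service:confluence} > 90"

  # Tags allow related monitors to be muted together.
  tags = ["service:confluence"]
}
```

Any settings found under "options" in the exported JSON would need to be translated into the matching resource arguments. Check the provider documentation for the version used here before relying on specific argument names.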
Dashboard
The Jenkins infra project uses multiple private and public dashboards to get a quick overview of specific situations. While it is possible to automate the configuration of those dashboards, we have not yet had the time to implement that automation.
There are multiple ways to contribute to those dashboards:
- If you think a specific dashboard would be useful for the community, feel free to open a Jira ticket on issues.jenkins-ci.org with a complete description and an explanation of why it should be implemented. Don't forget to set the component to 'datadog'.
- If you have spare time and want to practice or learn Terraform, feel free to look at our Jira board on issues.jenkins-ci.org and then open a Pull Request on this repository with your changes. Your PR will be tested on ci.jenkins.io.
Public Dashboard
Downtime
You can mute DataDog monitors by tag for a specific time interval.
For example, to mute all Confluence-related monitors at once, use the service:confluence tag.
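Downtimes can be scheduled directly in the DataDog UI. If you wanted to keep one under version control instead, a minimal sketch using the Terraform DataDog provider could look like the following. This is an assumption and not something this repository is known to do. The resource name, dates, and message are hypothetical, and newer provider versions may favor a different downtime resource, so check the documentation for the version in use.

```hcl
# Hypothetical downtime: mutes every monitor tagged service:confluence
# during a two-hour maintenance window. Dates are placeholders.
resource "datadog_downtime" "confluence_maintenance" {
  # Apply to all scopes, then narrow the selection by monitor tag.
  scope        = ["*"]
  monitor_tags = ["service:confluence"]

  # RFC 3339 timestamps delimiting the interval.
  start_date = "2030-01-01T10:00:00Z"
  end_date   = "2030-01-01T12:00:00Z"

  message = "Planned Confluence maintenance."
}
```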