cortexproject / cortex

A horizontally scalable, highly available, multi-tenant, long term Prometheus.

Home Page: https://cortexmetrics.io/

Alertmanager template changes are not fully reloaded

locmai opened this issue

Describe the bug
We run Cortex from the Helm chart in our Kubernetes cluster, with a sidecar in the alertmanager pod that continuously watches a configmap for changes and synchronizes the templates into our /data/fake/templates directory. The template files on disk are updated, but the changes are not fully reflected in the notification messages.

To Reproduce
Steps to reproduce the behavior (note: fake is the dummy tenant name):

  1. Start a minimal Cortex with alertmanager and sidecar
  2. Define a template file example.gotmpl (example in additional context)
  3. Let the alertmanager reload the configuration (logged by level.Info(am.logger).Log("msg", "synchronizing alertmanager configs for users"))
  4. Change the changeme part of the template in the configmap
  5. Let the alertmanager reload the configuration again (same log as step 3)
  6. Check the file `/data/fake/templates/example.gotmpl`; it will contain the changes
  7. Use amtool to simulate an alert

Expected behavior

The changes are reflected in the simulated alert sent by amtool.

Actual behavior

The old template is still being used

Environment:

  • Infrastructure: Kubernetes
  • Deployment tool: Helm

Additional Context

Alertmanager configuration:

receivers:
  - name: 'team-1'
    slack_configs:
      - channel: '#team1'
        send_resolved: true
        title: '{{ template "__alert_title" . }}'
        text: |-
          Title :{{ template "__alert_title" . }}
templates:
  - 'example.gotmpl'

example.gotmpl :

{{ define "__alert_title" -}}
   {{ .CommonLabels.alertname }} - changeme
{{- end }}
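
For reference, here is a minimal sketch of what the rendered title should look like after the template is loaded. It uses only Go's standard text/template package rather than Alertmanager's own template loader, and the alertData type, the renderTitle helper, and the HighLatency alert name are made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// exampleTmpl mirrors the example.gotmpl file from this issue.
const exampleTmpl = `{{ define "__alert_title" -}}
   {{ .CommonLabels.alertname }} - changeme
{{- end }}`

// alertData is a hypothetical stand-in for the data Alertmanager passes
// to notification templates; only CommonLabels is modelled here.
type alertData struct {
	CommonLabels map[string]string
}

// renderTitle parses the template source and executes the
// "__alert_title" definition for the given alert name.
func renderTitle(tmplSrc, alertname string) (string, error) {
	t, err := template.New("example.gotmpl").Parse(tmplSrc)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	data := alertData{CommonLabels: map[string]string{"alertname": alertname}}
	if err := t.ExecuteTemplate(&buf, "__alert_title", data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	title, err := renderTitle(exampleTmpl, "HighLatency")
	if err != nil {
		panic(err)
	}
	fmt.Println(title) // prints "HighLatency - changeme"
}
```

After editing the changeme part and reloading, the rendered title is expected to show the new text; in our case it keeps showing the old one.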

We tried calling the /api/v1/alerts endpoint, which returns the updated template, and the log message indicates that loadAndSyncConfigs is actually run.

I've traced the call path from loadAndSyncConfigs to setConfig, down to this line:
https://github.com/cortexproject/cortex/blob/9bc04ce3930b045480d72ab9712d3271c70c02ee/pkg/alertmanager/multitenant.go#L861C3-L861C68

This line seems to compare the templates from the loaded/updated cfg with the template file in the store (at templateFilePath). Because the sidecar has already updated that file, the comparison presumably finds no difference and the change goes undetected.

Need to take a look and see if we can reproduce this issue.