vshn / signalilo

Forward alerts from Prometheus Alertmanager to Icinga2 via Webhooks

[misconfiguration?] severity labels do not seem to work correctly

yoshi314 opened this issue

From the Signalilo log:

signalilo-7b4bb75878-vm84w signalilo {"level":"debug","msg":"Processing firing alert: alertname=apiservices_down, severity=CRITICAL, message=Apiservice v1beta1.webhook.cert-manager.io not operational : MissingEndpoints","time":"2021-03-08T13:54:03Z"}
signalilo-7b4bb75878-vm84w signalilo {"level":"info","msg":"updating service: apiservices_down_37c3e7be25078ec2\n","time":"2021-03-08T13:54:03Z"}
signalilo-7b4bb75878-vm84w signalilo {"level":"debug","msg":"Executing ProcessCheckResult on icinga2 for apiservices_down_37c3e7be25078ec2: exit status 3","time":"2021-03-08T13:54:03Z"}

I am always getting UNKNOWN, and I don't understand why.

Without --alertmanager_custom_severity_levels, Signalilo only maps the all-lowercase values "critical", "warning" and "normal" to exit status 2 ("CRITICAL"), 1 ("WARNING"), and 0 ("OK") of the Icinga2 service, respectively. The severity=CRITICAL label in your log is uppercase, so it matches none of these keys.

These default mappings are configured in config/config.go:

signalilo/config/config.go

Lines 90 to 105 in 5726f5c

// Create the default severity levels and then merge any custom ones into it.
// This keeps the defaults for backwards compatibility and allows both additions and overrides.
allLevels := map[string]int{
	"normal":   0,
	"warning":  1,
	"critical": 2,
}
for k, v := range config.CustomSeverityLevels {
	// Ensure the user set configuration values are valid otherwise default to UNKNOWN
	l, err := strconv.ParseInt(v, 10, 32)
	if err != nil || l < 0 || l > 3 {
		l = 3
	}
	allLevels[k] = int(l)
}
config.MergedSeverityLevels = allLevels

To also map uppercase values to the correct exit status, pass something like --alertmanager_custom_severity_levels CRITICAL=2 --alertmanager_custom_severity_levels WARNING=1 as extra arguments to Signalilo. If you want to specify them through an environment variable instead, separate the key-value pairs with a literal newline in the variable's value (e.g. in a Signalilo K8s container spec):

env:
  - name: SIGNALILO_ALERTMANAGER_CUSTOM_SEVERITY_LEVELS
    value: |-
      CRITICAL=2
      WARNING=1

Any value provided for label severity which is not present as a key in config.MergedSeverityLevels will result in exitstatus 3 ("UNKNOWN").
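
To make the behaviour seen in the log concrete, here is a minimal, self-contained sketch of the merge and lookup described in this comment. The helper names mergeLevels and exitStatusFor are made up for illustration and do not exist in Signalilo; only the merge logic is taken from the config.go snippet quoted above.

```go
package main

import (
	"fmt"
	"strconv"
)

// mergeLevels is a hypothetical helper mirroring the quoted config.go logic:
// start from the lowercase defaults, then add or override with custom levels.
func mergeLevels(custom map[string]string) map[string]int {
	levels := map[string]int{"normal": 0, "warning": 1, "critical": 2}
	for k, v := range custom {
		l, err := strconv.ParseInt(v, 10, 32)
		if err != nil || l < 0 || l > 3 {
			l = 3 // invalid values fall back to UNKNOWN
		}
		levels[k] = int(l) // the key is stored exactly as the user wrote it
	}
	return levels
}

// exitStatusFor is a hypothetical helper: an exact, case-sensitive lookup
// that returns 3 (UNKNOWN) for any severity that is not a key in levels.
func exitStatusFor(levels map[string]int, severity string) int {
	if s, ok := levels[severity]; ok {
		return s
	}
	return 3
}

func main() {
	defaults := mergeLevels(nil)
	fmt.Println(exitStatusFor(defaults, "critical")) // prints 2
	fmt.Println(exitStatusFor(defaults, "CRITICAL")) // prints 3 (UNKNOWN)

	custom := mergeLevels(map[string]string{"CRITICAL": "2", "WARNING": "1"})
	fmt.Println(exitStatusFor(custom, "CRITICAL")) // prints 2
}
```

With only the defaults, the uppercase label falls through to UNKNOWN; once CRITICAL=2 is added as a custom level, the same label resolves to exit status 2.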

It's odd, because I had them working briefly in uppercase. I'll reconfigure it, then.

@simu would it make sense to normalize (ToLower) levels before matching?

Probably, yes. I also just noticed that the README states

Required labels:

  • severity: Must be one of WARNING or CRITICAL, or any values set via the --alertmanager_custom_severity_levels option.

which obviously doesn't match the current implementation.

We could do a ToLower on the keys from --alertmanager_custom_severity_levels in config/config.go (the same lines 90 to 105 in 5726f5c that are quoted above), for example by changing allLevels[k] = int(l) to allLevels[strings.ToLower(k)] = int(l).

Then should we also do a ToLower on the severity value coming from Alertmanager in webhook/icinga.go when it is compared to the configured levels?

func severityToExitStatus(status string, severity string, severityLevels map[string]int) int {
	// default to "UNKNOWN"
	exitstatus := 3
	if status == "firing" {
		var ok bool
		exitstatus, ok = severityLevels[severity]
		if !ok {
			exitstatus = 3
		}
	} else if status == "resolved" {
		// mark exit status as NORMAL when alert state is "resolved"
		exitstatus = 0
	}
	return exitstatus
}

For example changing exitstatus, ok = severityLevels[severity] to exitstatus, ok = severityLevels[strings.ToLower(severity)]

And also updating the README to explain this better than I did in #52.
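
Putting the two proposed changes together, a rough standalone sketch could look like the following. This is not a patch against Signalilo: mergeLevelsNormalized is a made-up name that wraps the config.go logic in a function so the example runs on its own, and the custom level "Page" is purely illustrative.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// mergeLevelsNormalized is a hypothetical wrapper around the config.go logic,
// with the proposed change applied: custom keys are lowercased before storing.
func mergeLevelsNormalized(custom map[string]string) map[string]int {
	allLevels := map[string]int{"normal": 0, "warning": 1, "critical": 2}
	for k, v := range custom {
		l, err := strconv.ParseInt(v, 10, 32)
		if err != nil || l < 0 || l > 3 {
			l = 3 // invalid values default to UNKNOWN
		}
		allLevels[strings.ToLower(k)] = int(l)
	}
	return allLevels
}

// severityToExitStatus with the proposed change: the incoming severity label
// is lowercased before it is looked up in the configured levels.
func severityToExitStatus(status string, severity string, severityLevels map[string]int) int {
	exitstatus := 3 // default to UNKNOWN
	if status == "firing" {
		var ok bool
		exitstatus, ok = severityLevels[strings.ToLower(severity)]
		if !ok {
			exitstatus = 3
		}
	} else if status == "resolved" {
		exitstatus = 0 // resolved alerts are always OK
	}
	return exitstatus
}

func main() {
	levels := mergeLevelsNormalized(map[string]string{"Page": "2"})
	for _, sev := range []string{"critical", "CRITICAL", "Warning", "PAGE", "bogus"} {
		fmt.Println(sev, severityToExitStatus("firing", sev, levels))
	}
}
```

One thing to keep in mind with this approach: if a user configures two custom keys that differ only in case (say CRITICAL=2 and critical=1), they collapse into a single entry after lowercasing, and because Go map iteration order is not fixed, which value wins would not be deterministic.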