Alerts to Grafana when drift detected: "error":"postMessage failed: failed to execute request: context deadline exceeded"
lgarciaharo opened this issue
We are using tfcontroller and trying to send alerts to Grafana through Flux's notification-controller, following the documentation.
The configuration required to send alerts with Flux is already applied (see "Integrate with Flux Receivers and Alerts").
We also have a webhook provider: drift detection works properly and the alert is received at that webhook. So I assume the event format emitted by tfcontroller is correct for Flux.
On the Grafana side, the API token is already stored as a secret in the cluster and has been tested: we can create annotations in Grafana with curl against our https://<grafana-url>/api/annotations endpoint.
However, the Grafana provider and its alert aren't working.
We have enabled debug logs in the notification-controller and we can see:
{
  "alert": {
    "name": "tf-alert-grafana",
    "namespace": "ba999"
  },
  "error": "postMessage failed: failed to execute request: context deadline exceeded",
  "eventInvolvedObject": {
    "apiVersion": "infra.contrib.fluxcd.io/v1alpha2",
    "kind": "Terraform",
    "name": "my-tf",
    "namespace": "ba999",
    "resourceVersion": "13782667",
    "uid": "8fd840ac-4bb5-4a02-8872-0df8f3e66d56"
  },
  "level": "error",
  "logger": "event-server",
  "msg": "failed to send notification",
  "stacktrace": "github.com/fluxcd/notification-controller/internal/server.(*EventServer).handleEvent.func1.1\n\tgithub.com/fluxcd/notification-controller/internal/server/event_handlers.go:248",
  "ts": "2023-09-27T16:02:30.646Z"
}
The versions are:
Flux | Argo CD | Image |
---|---|---|
v2.0.1 | v2.7 | v2.7.10 |
I've also created an issue in tfcontroller, but since the event emitted by tfcontroller works fine with the webhook provider, I'm not sure whether the problem is in tfcontroller or on the Flux side.
Any idea about what is happening?
Any help is welcome :)
Thanks!
Hello, are you sure your Grafana API is reachable from within the cluster?
Hi @lgarciaharo ,
we had a similar issue today where the notification-controller was not able to call Grafana at all. No HTTP packets were sent to or received from Grafana when an event was processed, and we also received the "error": "postMessage failed: failed to execute request: context deadline exceeded" error message.
In our case the cause was a trailing line feed (\n) at the end of the Grafana service token stored in our Kubernetes secret.
This happened because we encoded it with:
echo "$token" | base64
on the command line instead of using:
echo -n "$token" | base64
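To make the difference between the two commands visible, here is a minimal sketch. It assumes a POSIX shell with GNU coreutils (base64 -d, wc); the token value is a hypothetical placeholder, not a real credential, and printf '%s' is used as a portable stand-in for echo -n.

```shell
# Hypothetical placeholder; substitute your real Grafana token.
token="example-token"

# Plain echo appends a line feed, so the \n gets encoded into the secret.
with_newline=$(echo "$token" | base64)

# printf '%s' (like echo -n) encodes only the token bytes.
without_newline=$(printf '%s' "$token" | base64)

# Decode both values and count the bytes: the first is one byte longer.
len_with=$(printf '%s' "$with_newline" | base64 -d | wc -c | tr -d ' ')
len_without=$(printf '%s' "$without_newline" | base64 -d | wc -c | tr -d ' ')

echo "encoded with echo:    $with_newline ($len_with bytes decoded)"
echo "encoded with echo -n: $without_newline ($len_without bytes decoded)"
```

To inspect a secret that is already in the cluster, something like kubectl get secret <secret-name> -n <namespace> -o jsonpath='{.data.token}' | base64 -d | od -c should show a trailing \n if one is present. The secret name, namespace, and the token key are assumptions here; adjust them to your setup.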
Maybe this helps you find your problem.
Best regards,
Florian.
Thanks @fbuchmeier-abi, I will try it next time I touch the code and let you know.