lyft / flinkk8soperator

Kubernetes operator that provides control plane for managing Apache Flink applications

Prom Metrics not populating for controllers

krishnaarava opened this issue

I am running Flink Operator v0.4.0 and don't see any Prometheus metrics populated at the /metrics endpoint for the "FlinkApplication" controller. For example, the counters cache_hit, cache_miss, and reconcile_error are missing from the /metrics endpoint. Is there any ConfigMap configuration that I need to set to see these counters?

cacheHit: labeled.NewCounter("cache_hit", "Flink application resource fetched from cache", reconcilerScope),

cacheMiss: labeled.NewCounter("cache_miss", "Flink application resource missing from cache", reconcilerScope),

reconcileError: labeled.NewCounter("reconcile_error", "Reconcile for application failed", reconcilerScope),

Did you follow the steps here - https://www.weave.works/docs/cloud/latest/tasks/monitor/configuration-k8s/ ?

Yes, we have the scraping configuration set up correctly. I was referring to the operator pod's endpoint, http://localhost:8080/metrics, which does not output the controller's Prometheus metrics, such as the counters I mentioned above.

prom.log

Attached are the Prometheus metrics I am seeing at the localhost:8080/metrics endpoint.

Do you have a FlinkApplication currently running? I recall that a lot of metrics won't show up at the endpoint until you've got at least one flinkapp running.

@krishnaarava I just ran into this issue as well and found that all the controller metrics are actually exposed on :10254/metrics for the operator pod. Hope that helps, even though it's a couple of months late.

Thanks @HunterEl, can see the metrics now.