Not able to ingest metrics, target labels are dropped
girishbin opened this issue · comments
What happened?
Description
I created a ServiceMonitor to scrape the Traefik metrics. After deploying it, I can see the Traefik entries under Service Discovery in the Prometheus web UI, but the Target Labels column shows them as DROPPED.
In the Prometheus resource, serviceMonitorSelector: {} is set, which means all ServiceMonitors are selected.
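For readers unfamiliar with where this selector lives, the fragment below is a minimal sketch of a Prometheus custom resource with an empty selector. The resource name `k8s` and the `serviceMonitorNamespaceSelector` line are assumptions for illustration (they follow common kube-prometheus defaults and are not shown in this issue).

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s               # assumed name; not shown in this issue
  namespace: monitoring
spec:
  # An empty selector matches every ServiceMonitor in the namespaces
  # picked by serviceMonitorNamespaceSelector.
  serviceMonitorSelector: {}
  # Assumption: an empty namespace selector, so ServiceMonitors in all
  # namespaces are considered. If this field is omitted, only the
  # Prometheus object's own namespace is searched.
  serviceMonitorNamespaceSelector: {}
```

Being selected only means the ServiceMonitor is turned into scrape configuration. Discovered targets can still appear as dropped if the endpoint definition does not match the Service.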
Expected Result
Traefik metrics should be ingested into Prometheus.
Actual Result
Traefik metrics are not ingested into Prometheus, and all the target labels are shown as DROPPED.
Prometheus Operator Version
Name: prometheus-operator
Namespace: monitoring
CreationTimestamp: Mon, 15 Jan 2024 00:03:05 +0530
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/name=prometheus-operator
app.kubernetes.io/part-of=kube-prometheus
app.kubernetes.io/version=0.70.0
Annotations: deployment.kubernetes.io/revision: 1
Selector: app.kubernetes.io/component=controller,app.kubernetes.io/name=prometheus-operator,app.kubernetes.io/part-of=kube-prometheus
Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/name=prometheus-operator
app.kubernetes.io/part-of=kube-prometheus
app.kubernetes.io/version=0.70.0
Annotations: kubectl.kubernetes.io/default-container: prometheus-operator
Service Account: prometheus-operator
Containers:
prometheus-operator:
Image: quay.io/prometheus-operator/prometheus-operator:v0.70.0
Port: 8080/TCP
Host Port: 0/TCP
Args:
--kubelet-service=kube-system/kubelet
--prometheus-config-reloader=quay.io/prometheus-operator/prometheus-config-reloader:v0.70.0
Limits:
cpu: 200m
memory: 200Mi
Requests:
cpu: 100m
memory: 100Mi
Environment:
GOGC: 30
Mounts: <none>
kube-rbac-proxy:
Image: quay.io/brancz/kube-rbac-proxy:v0.15.0
Port: 8443/TCP
Host Port: 0/TCP
SeccompProfile: RuntimeDefault
Args:
--secure-listen-address=:8443
--tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
--upstream=http://127.0.0.1:8080/
Limits:
cpu: 20m
memory: 40Mi
Requests:
cpu: 10m
memory: 20Mi
Environment: <none>
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Progressing True NewReplicaSetAvailable
Available True MinimumReplicasAvailable
OldReplicaSets: <none>
NewReplicaSet: prometheus-operator-5598fd768c (1/1 replicas created)
Events: <none>
### Kubernetes Version
```yaml
# kubectl version -o yaml
clientVersion:
  buildDate: "2023-09-13T09:35:49Z"
  compiler: gc
  gitCommit: 89a4ea3e1e4ddd7f7572286090359983e0387b2f
  gitTreeState: clean
  gitVersion: v1.28.2
  goVersion: go1.20.8
  major: "1"
  minor: "28"
  platform: linux/arm64
kustomizeVersion: v5.0.4-0.20230601165947-6ce0bf390ce3
serverVersion:
  buildDate: "2023-12-07T09:21:49Z"
  compiler: gc
  gitCommit: 4972474a92b08e5ffc04727134e9c1112959cc16
  gitTreeState: clean
  gitVersion: v1.26.11-gke.1055000
  goVersion: go1.20.11 X:boringcrypto
  major: "1"
  minor: "26"
  platform: linux/amd64
```
### Kubernetes Cluster Type
GKE
### How did you deploy Prometheus-Operator?
prometheus-operator/kube-prometheus
### Manifests
```yaml
# ServiceMonitor
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: traefik-metrics
  namespace: monitoring
  labels:
    app: traefik
spec:
  jobLabel: traefik-metrics
  selector:
    matchLabels:
      app: traefik-prometheus
  namespaceSelector:
    matchNames:
      - traefik
  endpoints:
    - port: traefik-prometheus
      path: /metrics
---
# App Service
apiVersion: v1
kind: Service
metadata:
  name: traefik-prometheus
  labels:
    app: traefik-prometheus
  namespace: traefik
spec:
  selector:
    app.kubernetes.io/instance: traefik-traefik
    app.kubernetes.io/name: traefik
  ports:
    - protocol: TCP
      name: traefik-metrics
      port: 9100

# Listing of endpoints:
# $ kubectl get endpoints -n traefik -l app=traefik-prometheus
# NAME                 ENDPOINTS         AGE
# traefik-prometheus   10.48.17.3:9100   4h30m
```
### prometheus-operator log output
The operator logs do not show any errors.
### Anything else?
Earlier I had an RBAC issue, which I resolved by adding RBAC permissions to get, list, and watch endpoints.
I have been struggling with this for almost four days and cannot figure out what the issue is. I would appreciate help fixing it.
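The RBAC fix mentioned above is not shown in the issue. As a rough sketch, a namespaced Role and RoleBinding along the following lines would grant that access in the `traefik` namespace. The object names, the `prometheus-k8s` service account, and the extra `services` and `pods` resources are assumptions based on the usual kube-prometheus layout; the issue itself only mentions endpoints.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s            # assumed name
  namespace: traefik
rules:
  - apiGroups: [""]
    # The issue mentions endpoints only; services and pods are added
    # here as an assumption, following the kube-prometheus pattern.
    resources: ["services", "endpoints", "pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-k8s            # assumed name
  namespace: traefik
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-k8s
subjects:
  - kind: ServiceAccount
    name: prometheus-k8s          # assumed service account used by Prometheus
    namespace: monitoring
```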
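The problem was the port name in the ServiceMonitor endpoint. It should be traefik-metrics, the name of the port defined in the Service, instead of traefik-prometheus, because the endpoint's port field refers to the Service port by name.
I had also tried using the port number "9100" earlier, and that did not work either.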
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: traefik-metrics
  namespace: monitoring
  labels:
    app: traefik
spec:
  jobLabel: traefik-metrics
  selector:
    matchLabels:
      app: traefik-prometheus
  namespaceSelector:
    matchNames:
      - traefik
  endpoints:
    - port: traefik-prometheus # this should be traefik-metrics
      path: /metrics
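Putting the fix together, the corrected ServiceMonitor would look roughly like the sketch below. All values are taken from the manifests in this issue; the only change is the endpoint port, which now matches the port name declared in the traefik-prometheus Service.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: traefik-metrics
  namespace: monitoring
  labels:
    app: traefik
spec:
  jobLabel: traefik-metrics
  selector:
    matchLabels:
      app: traefik-prometheus     # label on the Service, not on the pods
  namespaceSelector:
    matchNames:
      - traefik
  endpoints:
    # Must equal spec.ports[].name of the Service (traefik-metrics),
    # not the Service name or its label value.
    - port: traefik-metrics
      path: /metrics
```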
Issue is resolved. Thanks.