Remote Write enablement
07Rajat opened this issue
What happened?
Description
We are looking for a way to enable the remote write feature. We have multiple OpenShift clusters and we are trying to centralize their metrics under one Grafana dashboard.
In the image above there are two clusters, cluster 1 and cluster 2, each with Prometheus installed in a different namespace: a customized prometheus-operator deployment in one namespace, and the default one that ships with OpenShift itself in the openshift-monitoring namespace.
Here, we are trying to remote_write the data from the default Prometheus in openshift-monitoring to the customized Prometheus server.
On the customized side, Prometheus is deployed through the operator as a separate Prometheus object, and the Prometheus service is exposed:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  podMonitorSelector: {}
  resources:
    requests:
      memory: 400Mi
```
```yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  type: NodePort
  ports:
    - name: web
      nodePort: 30900
      port: 9090
      protocol: TCP
      targetPort: web
  selector:
    prometheus: prometheus
```
https://blog.container-solutions.com/prometheus-operator-beginners-guide
https://grafana.com/blog/2023/01/19/how-to-monitor-kubernetes-clusters-with-the-prometheus-operator/
Here, we are trying to customize the Prometheus YAML configuration; however, the operator does not allow us to change or modify anything in the StatefulSet it generates after the Prometheus deployment.
We are looking for an option to add the remote write configuration as a ConfigMap and mount it as a volume into the customized Prometheus configuration.
ConfigMap for reference:
```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: cluster-monitoring-config
  namespace: test
  labels:
    hive.openshift.io/managed: 'true'
data:
  config.yaml: |
    enableUserWorkload: true
    prometheusK8s:
      remoteWrite:
        - url: https://thanos-querier.openshift-monitoring.svc.cluster.local:9091/api/v1/write
          oauth2:
            clientId:
              secret:
                key: client-id
                name: observatorium-credentials
            clientSecret:
              key: client-secret
              name: observatorium-credentials
            tokenUrl: https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token
          remoteTimeout: 30s
          writeRelabelConfigs:
            - sourceLabels:
                - __name__
              action: keep
              regex: (addon_operator_addons_count|addon_operator_reconcile_error|addon_operator_addon_health_info|addon_operator_ocm_api_requests_durations|addon_operator_ocm_api_requests_durations_sum|addon_operator_ocm_api_requests_durations_count|addon_operator_paused|cluster_admin_enabled|limited_support_enabled|identity_provider|cpms_enabled|ingress_canary_route_reachable|ocm_agent_service_log_sent_total|sre:slo:probe_success_api|sre:slo:probe_success_console|sre:slo:upgradeoperator_upgrade_result|sre:slo:imageregistry_http_requests_total|sre:slo:oauth_server_requests_total|sre:sla:outage_5_minutes|sre:slo:apiserver_28d_slo|sre:slo:console_28d_slo|sre:error_budget_burn:apiserver_28d_slo|sre:error_budget_burn:console_28d_slo|sre:operators:succeeded)
          queueConfig:
            capacity: 2500
            maxShards: 1000
            minShards: 1
            maxSamplesPerSend: 2000
            batchSendDeadline: 60s
            minBackoff: 30ms
            maxBackoff: 1m
      nodeSelector:
        node-role.kubernetes.io/infra: ''
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          operator: Exists
      retention: 11d
      retentionSize: 90GB
      volumeClaimTemplate:
        metadata:
          name: prometheus-data
        spec:
          resources:
            requests:
              storage: 100Gi
    alertmanagerMain:
      nodeSelector:
        node-role.kubernetes.io/infra: ''
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          operator: Exists
      volumeClaimTemplate:
        metadata:
          name: alertmanager-data
        spec:
          resources:
            requests:
              storage: 10Gi
    telemeterClient:
      nodeSelector:
        node-role.kubernetes.io/infra: ''
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          operator: Exists
      telemeterServerURL: https://infogw.api.openshift.com
    prometheusOperator:
      nodeSelector:
        node-role.kubernetes.io/infra: ''
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          operator: Exists
    grafana:
      nodeSelector:
        node-role.kubernetes.io/infra: ''
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          operator: Exists
    k8sPrometheusAdapter:
      nodeSelector:
        node-role.kubernetes.io/infra: ''
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          operator: Exists
    kubeStateMetrics:
      nodeSelector:
        node-role.kubernetes.io/infra: ''
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          operator: Exists
    openshiftStateMetrics:
      nodeSelector:
        node-role.kubernetes.io/infra: ''
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          operator: Exists
    thanosQuerier:
      nodeSelector:
        node-role.kubernetes.io/infra: ''
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          operator: Exists
    monitoringPlugin:
      nodeSelector:
        node-role.kubernetes.io/infra: ''
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/infra
          operator: Exists
```
Your suggestions and support would be really appreciated.
Prometheus Operator Version
openshiftVersion: 4.13.29
kustomizeVersion: v4.5.4
Kubernetes Version
openshiftVersion: 4.13.29
kustomizeVersion: v4.5.4
Kubernetes Cluster Type
OpenShift
How did you deploy Prometheus-Operator?
Other (please comment)
Manifests
No response
prometheus-operator log output
Prometheus Operator 0.56.3 provided by Craig Trought
Anything else?
No response
If you are running Prometheus-Operator, you can specify the remote write config in the corresponding Prometheus CR itself. See here. Unless you have a very specific reason to use the ConfigMap, maybe this will help?
In case you do want to use the ConfigMap, something like additionalScrapeConfigs
lets you write your own config (in case some fields are not yet supported) in a Secret and reference it in the Prometheus CR.
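For illustration, a minimal sketch of the additionalScrapeConfigs approach might look like the following (the Secret name, key, job name, and target are hypothetical placeholders, not from this thread):

```yaml
# Hypothetical Secret holding extra scrape config in plain Prometheus format.
apiVersion: v1
kind: Secret
metadata:
  name: additional-scrape-configs
stringData:
  prometheus-additional.yaml: |
    - job_name: example-job            # hypothetical job name
      static_configs:
        - targets: ['example.com:9090']  # hypothetical target
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  # Reference the Secret key from the Prometheus CR; the operator
  # appends its contents to the generated scrape configuration.
  additionalScrapeConfigs:
    name: additional-scrape-configs
    key: prometheus-additional.yaml
```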
I'm not sure I understand your issue. The right way to configure the OCP Prometheus is via the CMO ConfigMap, though I'm not sure why you have https://thanos-querier.openshift-monitoring.svc.cluster.local:9091/api/v1/write
as the remote-write endpoint.
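For context, if the intent is for the customized central Prometheus to receive remote-write data itself, the Prometheus CR exposes an enableRemoteWriteReceiver field that turns on the /api/v1/write endpoint on the receiving side (availability depends on the operator and Prometheus versions); a minimal sketch:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  # Enable the remote-write receiver endpoint (/api/v1/write)
  # on this Prometheus instance's web port.
  enableRemoteWriteReceiver: true
```

Senders would then target the receiving Prometheus's web port (9090 by default) rather than a Thanos Querier endpoint.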
Hi @mviswanathsai , @simonpasquier
For the Prometheus deployment:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus-operator
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  podMonitorSelector: {}
  resources:
    requests:
      memory: 400Mi
```
And on top of that, for the remote_write URL:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus-operator
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  podMonitorSelector: {}
  resources:
    requests:
      memory: 400Mi
  remoteWrite:
    - url: https://prometheus-k8s.openshift-monitoring.svc.cluster.local:9091/api/v1/write
      oauth2:
        clientId:
          secret:
            key: client-id
            name: observatorium-credentials
        clientSecret:
          key: client-secret
          name: observatorium-credentials
        tokenUrl: https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token
      remoteTimeout: 30s
      writeRelabelConfigs:
        - sourceLabels:
            - __name__
          action: keep
          regex: (addon_operator_addons_count|addon_operator_reconcile_error|addon_operator_addon_health_info|addon_operator_ocm_api_requests_durations|addon_operator_ocm_api_requests_durations_sum|addon_operator_ocm_api_requests_durations_count|addon_operator_paused|cluster_admin_enabled|limited_support_enabled|identity_provider|cpms_enabled|ingress_canary_route_reachable|ocm_agent_service_log_sent_total|sre:slo:probe_success_api|sre:slo:probe_success_console|sre:slo:upgradeoperator_upgrade_result|sre:slo:imageregistry_http_requests_total|sre:slo:oauth_server_requests_total|sre:sla:outage_5_minutes|sre:slo:apiserver_28d_slo|sre:slo:console_28d_slo|sre:error_budget_burn:apiserver_28d_slo|sre:error_budget_burn:console_28d_slo|sre:operators:succeeded)
```
It seems the Prometheus object is refusing to create the StatefulSet with the remote_write option; could you please suggest a fix?
It seems the Prometheus object is refusing to create the StatefulSet with the remote_write option; could you please suggest a fix?
There's no such restriction. I would check the status field of the Prometheus object and the prometheus-operator logs.
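As a sketch of that check (the namespace and deployment names below are assumptions; adjust them to your setup):

```shell
# Inspect the Prometheus CR's status conditions for reconciliation problems
# (namespace assumed to be "monitoring"; the CR is named "prometheus" above)
kubectl -n monitoring get prometheus prometheus -o jsonpath='{.status.conditions}'

# Tail the prometheus-operator logs for errors about the rejected config
kubectl -n monitoring logs deployment/prometheus-operator --tail=100
```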
Prometheus Operator 0.56.3
This is a very old version. I'd advise upgrading.
Prometheus Operator 0.56.3
This is a very old version. I'd advise upgrading.
Thanks for the advice @simonpasquier, but I believe this is the latest prometheus-operator version available in the Red Hat Marketplace.