grafana / k6-operator

An operator for running distributed k6 tests.

Helm deploy failed to pass in custom configuration

zzhao2010 opened this issue

Brief summary

I was testing sending the k6 test metrics generated by each k6 executor pod to Prometheus via remote write. The remote write flag is enabled on Prometheus correctly: metrics reached Prometheus when I port-forwarded the Prometheus pod and triggered a test from my local machine. However, when I triggered the same test via k6-operator, I saw this error message on the k6 pod: msg="Failed to send the time series data to the endpoint" error="HTTP POST request failed: Post \"http://prometheus-kube-prometheus-prometheus:9090/api/v1/write\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"

k6-operator version or image

Helm chart version (if applicable)


TestRun / PrivateLoadZone YAML

kind: K6
  name: demo
  parallelism: 1
  cleanup: post
  arguments: -o experimental-prometheus-rw --tag testid=demo_test
      name: demo
      file: test.js
        value: "http://prometheus-kube-prometheus-prometheus:9090/api/v1/write"
        value: "true"

Other environment details (if applicable)

minikube version: v1.32.0
Client Version: v1.29.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.3

Steps to reproduce the problem

1. Enable Prometheus remote write with the values.yaml below:

prometheus:
  enabled: true
  prometheusSpec:
    ## enable --web.enable-remote-write-receiver flag on prometheus-server
    enableRemoteWriteReceiver: true

    # EnableFeatures API enables access to Prometheus disabled features.
    # ref:
    enableFeatures:
      - native-histograms

2. Apply the TestRun to the k8s cluster with an environment variable K6_PROMETHEUS_RW_SERVER_URL set to the service name of the Prometheus pod followed by :9090/api/v1/write.

Expected behaviour

The k6 metrics are reported to Prometheus.

Actual behaviour

demo-1-rdg95 time="2024-02-02T07:13:30Z" level=error msg="Failed to send the time series data to the endpoint" error="HTTP POST request failed: Post \"http://prometheus-kube-prometheus-prometheus:9090/api/v1/write\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" output="Prometheus remote write"
demo-1-rdg95 time="2024-02-02T07:13:30Z" level=warning msg="Successful flushed time series to remote write endpoint but it took 5.002130252s while flush period is 5s. Some samples may be dropped." nts=15 output="Prometheus remote write"

I tested the connection to Prometheus from another pod in the same cluster by curl -v -X POST http://prometheus-kube-prometheus-prometheus:9090/api/v1/write and the connection worked.

prometheus-grafana-9c98f646b-7h2mg:/usr/share/grafana$ curl -v -X POST http://prometheus-kube-prometheus-prometheus:9090/api/v1/write
* Host prometheus-kube-prometheus-prometheus:9090 was resolved.
* IPv6: (none)
* IPv4:
*   Trying
* Connected to prometheus-kube-prometheus-prometheus ( port 9090
> POST /api/v1/write HTTP/1.1
> Host: prometheus-kube-prometheus-prometheus:9090
> User-Agent: curl/8.5.0
> Accept: */*
< HTTP/1.1 400 Bad Request
< Content-Type: text/plain; charset=utf-8
< X-Content-Type-Options: nosniff
< Date: Fri, 02 Feb 2024 07:26:56 GMT
< Content-Length: 22
snappy: corrupt input

Hi @zzhao2010,

prometheus:
  enabled: true

Which chart is configured with these values? k6-operator's chart cannot be configured in this way. This looks like an issue with your setup rather than with k6-operator.

Given the error, one thing that is worth checking is the URL for Prometheus:

        value: "http://prometheus-kube-prometheus-prometheus:9090/api/v1/write"

This assumes that Prometheus is in the default namespace. The context timeout suggests that the k6 runners cannot reach it with this URL; it may be incorrect or incomplete. I'd double-check that part. Overall, this seems like an issue with the setup rather than a bug here.

It turned out the issue was with the URL. Because the Prometheus pod and the k6 pod run in different namespaces, the endpoint needs to include the namespace, like http://prometheus-kube-prometheus-prometheus.<namespace>.svc.cluster.local:9090/api/v1/write.
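As a sketch of the fix described here, the runner environment could carry the namespace-qualified service URL. The namespace `monitoring` is a hypothetical placeholder (the thread does not say where Prometheus runs), `svc.cluster.local` assumes the default cluster domain, and the `script` and `runner.env` layout follows the TestRun spec as I understand it. The original manifest used `kind: K6`, which may be what your operator version expects.

```yaml
# Sketch only: "monitoring" is a hypothetical namespace; replace it with
# the namespace where the Prometheus service actually lives.
apiVersion: k6.io/v1alpha1
kind: TestRun
metadata:
  name: demo
spec:
  parallelism: 1
  cleanup: post
  arguments: -o experimental-prometheus-rw --tag testid=demo_test
  script:
    configMap:
      name: demo
      file: test.js
  runner:
    env:
      - name: K6_PROMETHEUS_RW_SERVER_URL
        # <service>.<namespace>.svc.cluster.local resolves from any namespace,
        # whereas the bare service name only resolves within its own namespace.
        value: "http://prometheus-kube-prometheus-prometheus.monitoring.svc.cluster.local:9090/api/v1/write"
```

The shorter form `<service>.<namespace>` should also resolve inside the cluster, but the fully qualified name makes the target explicit.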

Another question, regarding the prometheus.enabled value in the docs for the k6-operator chart: what does it do? The description doesn't explain it clearly. Does it have to be enabled for metrics to be reported to Prometheus correctly?

prometheus.enabled is for creating a ServiceMonitor; that option is meant for users of Prometheus Operator.
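For illustration, the option would sit in the k6-operator chart's values roughly as below. This is a minimal sketch based only on the key name mentioned in this thread; other keys in the chart are not shown, and it assumes the Prometheus Operator CRDs are already installed in the cluster.

```yaml
# values.yaml for the k6-operator Helm chart (fragment, sketch only)
prometheus:
  # Creates a ServiceMonitor resource; only meaningful when
  # Prometheus Operator is installed in the cluster.
  enabled: true
```

Going by the answer above, this setting is separate from the K6_PROMETHEUS_RW_SERVER_URL configuration that the remote write output uses.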

Glad you resolved it. I'm closing this issue as it is not a bug in k6-operator. For future reference, it is recommended to raise inquiries regarding k6-operator in the community forum.