crate / cratedb-prometheus-adapter

CrateDB Prometheus Adapter.

Failed to POST/GET data from CrateDB: Croaks with err="context deadline exceeded"

RyanW8 opened this issue

The cratedb-adapter logs in Kubernetes are being spammed with the following errors:

time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:07Z" level=error msg="Failed to run select against Crate." err="context deadline exceeded" source="server.go:248"
time="2020-04-16T09:42:07Z" level=error msg="Failed to run select against Crate." err="context deadline exceeded" source="server.go:248"
time="2020-04-16T09:42:07Z" level=error msg="Failed to run select against Crate." err="context deadline exceeded" source="server.go:248"
time="2020-04-16T09:42:07Z" level=error msg="Failed to run select against Crate." err="context deadline exceeded" source="server.go:248"

CrateDB performance is good and is not the issue. It seems that after restarting the cratedb-adapter, it works perfectly for 30 seconds or so.

Dear Ryan,

thanks for your report here and at [0], and apologies for the very late reply.

"context deadline exceeded" is a very generic error raised by Go. It usually indicates that a connection timed out, or that some other networking issue is present, for example when one communication partner tries to negotiate a TLS connection while the other one isn't prepared for that. See [1], [2], and [3] for similar reports.

Can you share some more details about your version of CrateDB and the load situation?

With kind regards,
Andreas.

[0] #33
[1] prometheus/prometheus#1438
[2] sensu/sensu-go#3792
[3] https://stackoverflow.com/questions/49817558/context-deadline-exceeded-prometheus

Hi again,

> It seems that after restarting the cratedb-adapter, it works perfectly for 30 seconds or so.

On this, crate/crate#10779 also comes to mind. In this context, may I ask whether you are running CrateDB and Prometheus within a typical cloud environment or, otherwise, how specifically the cratedb-prometheus-adapter is connected to CrateDB, network-wise?

With kind regards,
Andreas.

Dear Ryan,

#44 improves the network behaviour slightly by adjusting the TCP timeout and keepalive settings. The following default values are now used:

  • TCP keepalive interval: 30 seconds
  • TCP connect timeout: 10 seconds

The new -tcp.connect.timeout command line option can be used to adjust the latter parameter.

With kind regards,
Andreas.

P.S.: We just released version 0.4.0, which is available in the form of release archives [1] and a Docker image [2].

[1] https://cdn.crate.io/downloads/dist/prometheus/
[2] https://ghcr.io/crate/cratedb-prometheus-adapter

Dear Ryan,

did you have a chance to validate whether the behavior has improved on your end with a more recent version? If not, would you mind if I close this issue? Please let me know if you need further assistance, or if the problem persists even with more recent versions.

With kind regards,
Andreas.

Dear Ryan,

we are just adding a patch which aims to improve the situation.

With kind regards,
Andreas.

Hi again,

the most recent release, version 0.5.0, fixed this flaw. Please let us know whether you still observe problems or see improved behavior. Please also ask to re-open this issue if you believe the problem has not been fixed yet.

With kind regards,
Andreas.