Readiness probe failing in pod: invalid literal for int() with base 10

Question

surfer190 opened this issue 5 years ago

When I describe the metrics pod with:

    oc describe pod hawkular-metrics-ks6sb -n openshift-infra

the events show the liveness script failing with this error:

    File "/opt/hawkular/scripts/hawkular-metrics-liveness.py", line 48, in <module>
        if int(uptime) < int(timeout):
    ValueError: invalid literal for int() with base 10: ''

Full Events output:

    Type     Reason     Age   From                              Message
    ----     ------     ----  ----                              -------
    Normal   Scheduled  1h    default-scheduler                 Successfully assigned openshift-infra/hawkular-metrics-ks6sb to openshift.example.co.za
    Normal   Pulled     1h    kubelet, openshift.example.co.za  Container image "docker.io/openshift/origin-metrics-hawkular-metrics:v3.11.0" already present on machine
    Normal   Created    1h    kubelet, openshift.example.co.za  Created container
    Normal   Started    1h    kubelet, openshift.example.co.za  Started container
    Warning  Unhealthy  1h    kubelet, openshift.example.co.za  Liveness probe failed: Failed to access the status endpoint : <urlopen error [Errno 111] Connection refused>.
    Traceback (most recent call last):
    File "/opt/hawkular/scripts/hawkular-metrics-liveness.py", line 48, in <module>
        if int(uptime) < int(timeout):
    ValueError: invalid literal for int() with base 10: ''
    Warning  Unhealthy  1h (x3 over 1h)  kubelet, openshift.example.co.za  Readiness probe failed: Failed to access the status endpoint : <urlopen error [Errno 111] Connection refused>. This may be due to Hawkular Metrics not being ready yet. Will try again.
    Warning  Unhealthy  1h               kubelet, openshift.example.co.za  Readiness probe failed: Failed to access the status endpoint : timed out. This may be due to Hawkular Metrics not being ready yet. Will try again.
    Warning  Unhealthy  1h (x3 over 1h)  kubelet, openshift.example.co.za  Readiness probe failed: The MetricService is not yet in the STARTED state [STARTING]. We need to wait until its in the STARTED state.

I'm not sure which of the two variables on that line of hawkular-metrics-liveness.py (uptime or timeout) is the empty string, but we should add a test case for each.
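To make the failure easier to diagnose, here is a minimal sketch of how the comparison could name the offending variable instead of raising a bare int() error. I have not seen the rest of the script, so the helper names to_int and uptime_below_timeout are hypothetical, and raising a ValueError with a clearer message (rather than, say, treating an empty value as zero) is my assumption, not the script's actual behaviour.

```python
def to_int(name, raw):
    """Convert raw to int, naming the variable in the error if it cannot be parsed.

    Hypothetical helper: not part of hawkular-metrics-liveness.py.
    """
    text = "" if raw is None else str(raw).strip()
    try:
        return int(text)
    except ValueError:
        # Covers the empty-string case from the traceback as well as
        # any other non-numeric value.
        raise ValueError("%s is not an integer: %r" % (name, raw))


def uptime_below_timeout(uptime, timeout):
    """Same comparison as line 48 of the script, with clearer errors."""
    return to_int("uptime", uptime) < to_int("timeout", timeout)


if __name__ == "__main__":
    # One test case per variable, as suggested above.
    for args in (("", "300"), ("10", "")):
        try:
            uptime_below_timeout(*args)
        except ValueError as err:
            print(err)
```

With this in place the two test cases would each fail with a message that says which value was empty, which is the piece of information missing from the traceback above.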

OC Version:

[root@openshift ~]# oc version
oc v3.11.0+62803d0-1
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://openshift.example.co.za:8443
openshift v3.11.0+bd0bee4-337
kubernetes v1.11.0+d4cacc0

OpenShift Bot · Answer 1 · Sun Sep 20 2020 13:04:32 GMT+0800 (China Standard Time)

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

OpenShift Bot · Answer 2 · Thu Oct 22 2020 03:00:37 GMT+0800 (China Standard Time)

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

OpenShift Bot · Answer 3 · Sat Nov 21 2020 04:49:49 GMT+0800 (China Standard Time)

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

OpenShift CI Robot · Answer 4 · Sat Nov 21 2020 04:50:06 GMT+0800 (China Standard Time)

@openshift-bot: Closing this issue.
