Readiness probe failing in pod: invalid literal for int() with base 10
surfer190 opened this issue · comments
When I describe the metrics pod with:
oc describe pod hawkular-metrics-ks6sb -n openshift-infra
I get the following error:
File "/opt/hawkular/scripts/hawkular-metrics-liveness.py", line 48, in <module>
if int(uptime) < int(timeout):
ValueError: invalid literal for int() with base 10: ''
Entire Events output:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 1h default-scheduler Successfully assigned openshift-infra/hawkular-metrics-ks6sb to openshift.example.co.za
Normal Pulled 1h kubelet, openshift.example.co.za Container image "docker.io/openshift/origin-metrics-hawkular-metrics:v3.11.0" already present on machine
Normal Created 1h kubelet, openshift.example.co.za Created container
Normal Started 1h kubelet, openshift.example.co.za Started container
Warning Unhealthy 1h kubelet, openshift.example.co.za Liveness probe failed: Failed to access the status endpoint : <urlopen error [Errno 111] Connection refused>.
Traceback (most recent call last):
File "/opt/hawkular/scripts/hawkular-metrics-liveness.py", line 48, in <module>
if int(uptime) < int(timeout):
ValueError: invalid literal for int() with base 10: ''
Warning Unhealthy 1h (x3 over 1h) kubelet, openshift.example.co.za Readiness probe failed: Failed to access the status endpoint : <urlopen error [Errno 111] Connection refused>. This may be due to Hawkular Metrics not being ready yet. Will try again.
Warning Unhealthy 1h kubelet, openshift.example.co.za Readiness probe failed: Failed to access the status endpoint : timed out. This may be due to Hawkular Metrics not being ready yet. Will try again.
Warning Unhealthy 1h (x3 over 1h) kubelet, openshift.example.co.za Readiness probe failed: The MetricService is not yet in the STARTED state [STARTING]. We need to wait until its in the STARTED state.
I'm not sure which of the two variables compared in hawkular-metrics-liveness.py (uptime or timeout) is an empty string, but we should add a test case for both.
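As a rough sketch of the kind of guard and test cases meant here: the only names taken from the traceback are uptime and timeout. The helper names parse_seconds and within_startup_window are hypothetical, and the rest of hawkular-metrics-liveness.py is not shown in this issue, so this is an illustration of handling an empty string on either side of the comparison, not the script's actual code.

```python
def parse_seconds(value):
    """Return value as an int, or None if it is missing, empty, or not an integer.

    Hypothetical helper: int('') raises ValueError, which is the failure
    reported in the traceback, so empty input is handled explicitly.
    """
    if value is None:
        return None
    text = str(value).strip()
    if not text:
        return None
    try:
        return int(text)
    except ValueError:
        return None


def within_startup_window(uptime, timeout):
    """Return True only if both values parse and uptime is below timeout.

    Assumption: when either value cannot be parsed, this returns False
    instead of raising. What the probe should do in that case is a policy
    decision for the script's maintainers.
    """
    uptime_s = parse_seconds(uptime)
    timeout_s = parse_seconds(timeout)
    if uptime_s is None or timeout_s is None:
        return False
    return uptime_s < timeout_s


# Test cases covering an empty string in either variable.
assert within_startup_window("", "500") is False
assert within_startup_window("120", "") is False
assert within_startup_window("120", "500") is True
assert within_startup_window("600", "500") is False
```

Returning False for unparseable input is only one option. The script could equally log which variable was empty and exit with a clear message, which would also answer the question of which value is missing.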
OC Version:
[root@openshift ~]# oc version
oc v3.11.0+62803d0-1
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://openshift.example.co.za:8443
openshift v3.11.0+bd0bee4-337
kubernetes v1.11.0+d4cacc0
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
@openshift-bot: Closing this issue.