mongodb / mongodb-kubernetes-operator

MongoDB Community Kubernetes Operator

Readiness probe failed: panic: open /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json: no such file or directory

balait4 opened this issue

I have the same issue as mentioned in the already closed issue #959.

Others are facing this as well, and the fix there only worked for 4.2.6.

I tried versions 4.2 and 6.0, and both still show the issue below:

```
Readiness probe failed: panic: open /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json: no such file or directory
goroutine 1 [running]:
main.main()
        /workspace/cmd/readiness/main.go:217 +0x19a
```

My operator version is 0.7.9.
The issue occurs with MongoDB versions 4.2 and 6.0.

Is there any update on this, or an existing workaround? Thanks.

I also had this problem on 6.0.6 in operator version 0.8.0 and started following this issue.

The problem sorted itself out today when I made some fixes to my TLS certificate configuration. I don't know if this is the same problem you are facing, but if you are using TLS, the connection string used by the readiness probe requires the TLS certificates to be valid and to match the name of the service.

My assumption is the agent-health-status.json file is not written to if the probe never connects to the service successfully in the first place.
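
For anyone hitting this with TLS enabled, this is roughly what the TLS section of a MongoDBCommunity resource looks like. It is only a sketch: the resource, secret, and config map names are placeholders, and the field names follow the operator's TLS documentation as I recall them, so verify them against the CRD in your cluster (e.g. `kubectl explain mongodbcommunity.spec.security.tls`).

```yaml
# Sketch of a TLS-enabled MongoDBCommunity resource. All names are placeholders.
# The certificate referenced by certificateKeySecretRef must be valid and its
# SANs must cover the service/pod DNS names, otherwise the agent never becomes
# healthy and the readiness probe finds no agent-health-status.json to read.
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: example-mongodb
spec:
  members: 3
  type: ReplicaSet
  version: "6.0.6"
  security:
    tls:
      enabled: true
      certificateKeySecretRef:
        name: example-mongodb-tls      # secret containing tls.crt / tls.key
      caConfigMapRef:
        name: example-mongodb-ca       # config map containing ca.crt
```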

Thanks for the reply. In my case I still haven't enabled or configured TLS. I built my own images for versions 4.2 and 6.0, and with those I get this issue. If I use docker.io/mongo:4.2.6 or 6.0.6, it works fine.

@balait4 I had the same issue with MongoDB 6.0.5 when setting the number of members to either 1 or 2. Setting it to 3 fixed the issue.

What is your ReplicaSet member count?
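
For reference, the member count is the `spec.members` field on the MongoDBCommunity resource. A minimal sketch with placeholder names, following the observation above that 3 members worked where 1 or 2 did not; the authentication/users block mirrors the operator's sample resource, so adjust it to your setup:

```yaml
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: example-mongodb              # placeholder name
spec:
  members: 3                         # 1 or 2 reportedly triggered the probe failure; 3 worked
  type: ReplicaSet
  version: "6.0.5"
  security:
    authentication:
      modes: ["SCRAM"]
  users:
    - name: admin
      db: admin
      passwordSecretRef:
        name: admin-password         # placeholder secret holding the password
      roles:
        - name: root
          db: admin
      scramCredentialsSecretName: admin-scram
```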

Thanks, yes, it works for me too. But one question: when I update the MongoDBCommunity resource with any change, the operator does not reconcile it straight away; I need to delete the StatefulSet, and only then is the StatefulSet recreated with the new configuration. Is this expected?

@balait4 I guess it depends on the changes you want to implement. For the member count, for example, once you modify the manifest, save it and run kubectl apply, it should work.

But if you want to add something like certificates, you would need to provision other resources, and in that case deleting and recreating the StatefulSet is the way to go.

Still, it shouldn't be a problem, since you keep the PVCs; once your database is back up, everything should return to normal.
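
To make that concrete, here is a sketch of the reconcile-in-place flow (resource and file names are placeholders): change `spec.members` in the manifest, re-apply it, and let the operator scale the existing StatefulSet; no manual deletion is needed for this kind of change.

```yaml
# Edit the manifest and re-apply it; the operator reconciles the change in place:
#
#   kubectl apply -f example-mongodb.yaml
#   kubectl get mongodbcommunity example-mongodb -o yaml   # watch the status while it reconciles
#
# Only for changes that need extra resources (e.g. certificates) would you fall
# back to deleting the StatefulSet; the PVCs created from its volumeClaimTemplates
# are kept by default, so the data survives the recreation.
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: example-mongodb
spec:
  members: 3          # e.g. changed from 2; applied without deleting anything
  type: ReplicaSet
  version: "6.0.6"
```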

Thanks for the reply!

This issue is being marked stale because it has been open for 60 days with no activity. Please comment if this issue is still affecting you. If there is no change, this issue will be closed in 30 days.

I am having the same issue with mongo:6.0.8. After changing the image version to mongo:6.0.6, the problem was resolved.

Could this be an issue with the mongo:6.0.8 image, or an operator issue?

I think the issue is with the operator. When I completely deleted the replica set and redeployed it, the problem still occurred, even with mongo:6.0.6.

When I patched the deployment with a new mongo image, everything started working fine.

The trick is to change the MongoDB version and apply the change without deleting the deployment.
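
A sketch of that workaround, with placeholder names: the version bump is only a change to `spec.version`, applied (or patched) without deleting the MongoDBCommunity resource or its StatefulSet first.

```yaml
# In-place version change; nothing is deleted first.
#
#   kubectl patch mongodbcommunity example-mongodb --type merge \
#     -p '{"spec":{"version":"6.0.6"}}'
#
# or edit the manifest and re-apply it:
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: example-mongodb
spec:
  members: 3
  type: ReplicaSet
  version: "6.0.6"    # e.g. changed from 6.0.8 and re-applied
```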

Hey, this relates to the readinessProbe used by the operator to define readiness of the pods.

It should be a red herring and mostly harmless, since the readinessProbe eventually recovers.

Having said that, this has been fixed in newer versions (starting with 1.0.15 IIRC). The operator sources the readinessProbe version from an environment variable, as seen in this helm chart: https://github.com/mongodb/helm-charts/blob/a9cd1a8945ab98dfdc6e1f99c169822a6dacd7ab/charts/community-operator/values.yaml#L67
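
For anyone pinning this on an older setup, a hedged sketch of the helm values override that selects the readiness probe version. The key names are assumed from the linked values.yaml as I recall them, so double-check them for your chart version; the chart turns this into an environment variable (READINESS_PROBE_IMAGE, IIRC) on the operator Deployment.

```yaml
# values override for the community-operator helm chart (key names assumed from
# the linked values.yaml; verify against your chart version).
readinessProbe:
  name: mongodb-kubernetes-readinessprobe
  version: 1.0.15    # the release mentioned above as containing the fix
```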

PR: #1224

I am closing this issue.

If there are issues with the readinessProbe marking the pod as unready when it should be ready, it is most likely not because of the above reason but something else.