Operator fails to install MongoDB 7.0.9
alinalex1392 opened this issue
What did you do to encounter the bug?
Steps to reproduce the behavior:
- Apply the following MongoDBCommunity CR to create a replica set running version 7.0.9:
```yaml
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"mongodbcommunity.mongodb.com/v1","kind":"MongoDBCommunity","metadata":{"annotations":{},"name":"test-upgrade-downgrade-5","namespace":"mongodb"},"spec":{"additionalMongodConfig":{"storage.wiredTiger.engineConfig.journalCompressor":"zlib"},"featureCompatibilityVersion":"6.0","members":3,"security":{"authentication":{"modes":["SCRAM"]}},"statefulSet":{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"mongodb-pull-secret"}]}}}},"type":"ReplicaSet","users":[{"db":"admin","name":"my-user","passwordSecretRef":{"name":"my-user-password"},"roles":[{"db":"admin","name":"clusterAdmin"},{"db":"admin","name":"userAdminAnyDatabase"},{"db":"admin","name":"dbOwner"}],"scramCredentialsSecretName":"my-scram"}],"version":"7.0.9"}}
  name: test-upgrade-downgrade-5
  namespace: mongodb
spec:
  additionalMongodConfig:
    storage.wiredTiger.engineConfig.journalCompressor: zlib
  featureCompatibilityVersion: "6.0"
  members: 3
  security:
    authentication:
      ignoreUnknownUsers: true
      modes:
        - SCRAM
  statefulSet:
    spec:
      template:
        spec:
          imagePullSecrets:
            - name: mongodb-pull-secret
  type: ReplicaSet
  users:
    - db: admin
      name: my-user
      passwordSecretRef:
        name: my-user-password
      roles:
        - db: admin
          name: clusterAdmin
        - db: admin
          name: userAdminAnyDatabase
        - db: admin
          name: dbOwner
      scramCredentialsSecretName: my-scram
  version: 7.0.9
status:
  currentMongoDBMembers: 0
  currentStatefulSetReplicas: 0
  message: ReplicaSet is not yet ready, retrying in 10 seconds
  mongoUri: ""
  phase: Pending
```
- The operator reconciles and initializes the first replica, but the mongod container of the MongoDB instance reports the following error:
2024-05-13T07:30:08.587Z INFO versionhook/main.go:32 Running version change post-start hook
2024-05-13T07:30:08.587Z INFO versionhook/main.go:39 Waiting for agent health status...
2024-05-13T07:30:09.587Z INFO versionhook/main.go:75 Pod should not be deleted, mongod started
: 9: exec: mongod: not found
Stream closed EOF for mongodb/test-upgrade-downgrade-5-0 (mongod)
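For context on the `: 9: exec: mongod: not found` line: the mongod container starts via a `/bin/sh -c` wrapper that ends in `exec mongod -f /data/automation-mongod.conf`, and when `exec` cannot find the target binary on `PATH`, the shell exits with status 127. A minimal local sketch of that failure mode, using a deliberately bogus binary name (a stand-in, not the real image contents):

```python
import subprocess

# Reproduce the container's failure locally: run /bin/sh -c 'exec <binary> ...'
# with a binary that is not on PATH, mirroring a mongod container whose image
# does not ship a mongod executable.
result = subprocess.run(
    ["sh", "-c", "exec mongod-missing-binary -f /data/automation-mongod.conf"],
    capture_output=True,
    text=True,
)
print(result.returncode)      # 127: POSIX exit status for "command not found"
print(result.stderr.strip())  # e.g. "sh: 1: exec: mongod-missing-binary: not found"
```

The exact stderr wording depends on which shell provides `sh` (dash prints the `: 1:` line-number prefix seen in the pod log), but the 127 exit status is the same.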
What did you expect?
The operator should install MongoDB 7.0.9 successfully.
What happened instead?
The MongoDB instance fails during installation.
Operator Information
- Operator Version: mongodb/mongodb-kubernetes-operator:0.9.0
- MongoDB Image used: mongo:7.0.9
Kubernetes Cluster Information
- Distribution: k3s
- Version: v1.26.12+k3s1
- Image Registry location (quay, or an internal registry): internal registry
Additional context
Scratch installs and upgrades work for versions 7.0.6 through 7.0.8.
If possible, please include:
- The operator logs
- Below we assume that your replica set database pods are named mongo-<>. For instance:
kubectl get pod -n mongodb
NAME READY STATUS RESTARTS AGE
mongodb-kubernetes-operator-7c46f87988-5hvh2 1/1 Running 0 4d16h
test-upgrade-downgrade-5-0 0/2 Error 9 (5m20s ago) 22m
- yaml definitions of your MongoDB Deployment(s):
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  creationTimestamp: "2024-05-13T07:13:25Z"
  generation: 1
  name: test-upgrade-downgrade-5
  namespace: mongodb
  ownerReferences:
    - apiVersion: mongodbcommunity.mongodb.com/v1
      blockOwnerDeletion: true
      controller: true
      kind: MongoDBCommunity
      name: test-upgrade-downgrade-5
      uid: 978e0a7b-c639-414d-82e6-4250eaad15af
  resourceVersion: "153128"
  uid: aefba2e0-88f5-480a-b84b-8ed57fac5f90
spec:
  podManagementPolicy: OrderedReady
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: test-upgrade-downgrade-5-svc
  serviceName: test-upgrade-downgrade-5-svc
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: test-upgrade-downgrade-5-svc
    spec:
      containers:
        - args:
            - ""
          command:
            - /bin/sh
            - -c
            - |2+
              #run post-start hook to handle version changes
              /hooks/version-upgrade
              # wait for config and keyfile to be created by the agent
              while ! [ -f /data/automation-mongod.conf -a -f /var/lib/mongodb-mms-automation/authentication/keyfile ]; do sleep 3 ; done ; sleep 2 ;
              # start mongod with this configuration
              exec mongod -f /data/automation-mongod.conf;
          env:
            - name: AGENT_STATUS_FILEPATH
              value: /healthstatus/agent-health-status.json
          image: mongo:7.0.9
          imagePullPolicy: IfNotPresent
          name: mongod
          resources:
            limits:
              cpu: "1"
              memory: 500M
            requests:
              cpu: 500m
              memory: 400M
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /data
              name: data-volume
            - mountPath: /healthstatus
              name: healthstatus
            - mountPath: /hooks
              name: hooks
            - mountPath: /var/log/mongodb-mms-automation
              name: logs-volume
            - mountPath: /var/lib/mongodb-mms-automation/authentication
              name: test-upgrade-downgrade-5-keyfile
            - mountPath: /tmp
              name: tmp
        - command:
            - /bin/bash
            - -c
            - |-
              current_uid=$(id -u)
              AGENT_API_KEY="$(cat /mongodb-automation/agent-api-key/agentApiKey)"
              declare -r current_uid
              if ! grep -q "${current_uid}" /etc/passwd ; then
                sed -e "s/^mongodb:/builder:/" /etc/passwd > /tmp/passwd
                echo "mongodb:x:$(id -u):$(id -g):,,,:/:/bin/bash" >> /tmp/passwd
                export NSS_WRAPPER_PASSWD=/tmp/passwd
                export LD_PRELOAD=libnss_wrapper.so
                export NSS_WRAPPER_GROUP=/etc/group
              fi
              agent/mongodb-agent -healthCheckFilePath=/var/log/mongodb-mms-automation/healthstatus/agent-health-status.json -serveStatusPort=5000 -cluster=/var/lib/automation/config/cluster-config.json -skipMongoStart -noDaemonize -useLocalMongoDbTools -logFile ${AGENT_LOG_FILE} -maxLogFileDurationHrs ${AGENT_MAX_LOG_FILE_DURATION_HOURS} -logLevel ${AGENT_LOG_LEVEL}
          env:
            - name: AGENT_LOG_FILE
              value: /var/log/mongodb-mms-automation/automation-agent.log
            - name: AGENT_LOG_LEVEL
              value: INFO
            - name: AGENT_MAX_LOG_FILE_DURATION_HOURS
              value: "24"
            - name: AGENT_STATUS_FILEPATH
              value: /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json
            - name: AUTOMATION_CONFIG_MAP
              value: test-upgrade-downgrade-5-config
            - name: HEADLESS_AGENT
              value: "true"
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
          image: 7/mongodb/mongodb-agent:107.0.0.8465-1
          imagePullPolicy: Always
          name: mongodb-agent
          readinessProbe:
            exec:
              command:
                - /opt/scripts/readinessprobe
            failureThreshold: 40
            initialDelaySeconds: 5
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          resources:
            limits:
              cpu: "1"
              memory: 500M
            requests:
              cpu: 500m
              memory: 400M
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /opt/scripts
              name: agent-scripts
            - mountPath: /var/lib/automation/config
              name: automation-config
              readOnly: true
            - mountPath: /data
              name: data-volume
            - mountPath: /var/log/mongodb-mms-automation/healthstatus
              name: healthstatus
            - mountPath: /var/log/mongodb-mms-automation
              name: logs-volume
            - mountPath: /var/lib/mongodb-mms-automation/authentication
              name: test-upgrade-downgrade-5-keyfile
            - mountPath: /tmp
              name: tmp
      dnsPolicy: ClusterFirst
      imagePullSecrets:
        - name: mongodb-pull-secret
      initContainers:
        - command:
            - cp
            - version-upgrade-hook
            - /hooks/version-upgrade
          image: mongodb-kubernetes-operator-version-upgrade-post-start-hook:1.0.8
          imagePullPolicy: Always
          name: mongod-posthook
          resources: {}
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /hooks
              name: hooks
        - command:
            - cp
            - /probes/readinessprobe
            - /opt/scripts/readinessprobe
          image: mongodb-kubernetes-readinessprobe:1.0.17
          imagePullPolicy: Always
          name: mongodb-agent-readinessprobe
          resources: {}
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /opt/scripts
              name: agent-scripts
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 2000
        runAsNonRoot: true
        runAsUser: 2000
      serviceAccount: mongodb-database
      serviceAccountName: mongodb-database
      terminationGracePeriodSeconds: 30
      volumes:
        - emptyDir: {}
          name: agent-scripts
        - name: automation-config
          secret:
            defaultMode: 416
            secretName: test-upgrade-downgrade-5-config
        - emptyDir: {}
          name: healthstatus
        - emptyDir: {}
          name: hooks
        - emptyDir: {}
          name: test-upgrade-downgrade-5-keyfile
        - emptyDir: {}
          name: tmp
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
    - apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        creationTimestamp: null
        name: data-volume
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10G
        volumeMode: Filesystem
      status:
        phase: Pending
    - apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        creationTimestamp: null
        name: logs-volume
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 2G
        volumeMode: Filesystem
      status:
        phase: Pending
status:
  availableReplicas: 0
  collisionCount: 0
  currentReplicas: 1
  currentRevision: test-upgrade-downgrade-5-7c6f9fd667
  observedGeneration: 1
  replicas: 1
  updateRevision: test-upgrade-downgrade-5-7c6f9fd667
  updatedReplicas: 1
```
- The agent clusterconfig of the faulty members:
kubectl exec -it mongo-0 -c mongodb-agent -- cat /var/lib/automation/config/cluster-config.json
{"version":1,"processes":[{"name":"test-upgrade-downgrade-5-0","disabled":false,"hostname":"test-upgrade-downgrade-5-0.test-upgrade-downgrade-5-svc.mongodb.svc.cluster.local","args2_6":{"net":{"port":27017},"replication":{"replSetName":"test-upgrade-downgrade-5"},"storage":{"dbPath":"/data","wiredTiger":{"engineConfig":{"journalCompressor":"zlib"}}}},"featureCompatibilityVersion":"6.0","processType":"mongod","version":"7.0.9","authSchemaVersion":5,"LogRotate":{"timeThresholdHrs":0,"sizeThresholdMB":0}},{"name":"test-upgrade-downgrade-5-1","disabled":false,"hostname":"test-upgrade-downgrade-5-1.test-upgrade-downgrade-5-svc.mongodb.svc.cluster.local","args2_6":{"net":{"port":27017},"replication":{"replSetName":"test-upgrade-downgrade-5"},"storage":{"dbPath":"/data","wiredTiger":{"engineConfig":{"journalCompressor":"zlib"}}}},"featureCompatibilityVersion":"6.0","processType":"mongod","version":"7.0.9","authSchemaVersion":5,"LogRotate":{"timeThresholdHrs":0,"sizeThresholdMB":0}},{"name":"test-upgrade-downgrade-5-2","disabled":false,"hostname":"test-upgrade-downgrade-5-2.test-upgrade-downgrade-5-svc.mongodb.svc.cluster.local","args2_6":{"net":{"port":27017},"replication":{"replSetName":"test-upgrade-downgrade-5"},"storage":{"dbPath":"/data","wiredTiger":{"engineConfig":{"journalCompressor":"zlib"}}}},"featureCompatibilityVersion":"6.0","processType":"mongod","version":"7.0.9","authSchemaVersion":5,"LogRotate":{"timeThresholdHrs":0,"sizeThresholdMB":0}}],"replicaSets":[{"_id":"test-upgrade-downgrade-5","members":[{"_id":0,"host":"test-upgrade-downgrade-5-0","arbiterOnly":false,"votes":1,"priority":1},{"_id":1,"host":"test-upgrade-downgrade-5-1","arbiterOnly":false,"votes":1,"priority":1},{"_id":2,"host":"test-upgrade-downgrade-5-2","arbiterOnly":false,"votes":1,"priority":1}],"protocolVersion":"1","numberArbiters":0}],"auth":{"usersWanted":[{"mechanisms":[],"roles":[{"role":"clusterAdmin","db":"admin"},{"role":"userAdminAnyDatabase","db":"admin"},{"role":"dbOwner","db":"a
dmin"}],"user":"my-user","db":"admin","authenticationRestrictions":[],"scramSha256Creds":{"iterationCount":15000,"salt":"/jLBXRIwZ6aIokJdNVd6Pt4KO0/oCybdMuy8iw==","serverKey":"qIllLCzn87xNPTkcxcAI5/M8cxFQjpMaqER5ePB8vb4=","storedKey":"3rjFn7Kl/hyX4EOTYEmavMfPMyTdhJ108Q4wFmZz//M="},"scramSha1Creds":{"iterationCount":10000,"salt":"MZb8l+Hz+WXy/qtRhv8LcA==","serverKey":"JKMc7AtdsRcD/Upn2vVq+h0Bv/4=","storedKey":"HiRXHIyTwPFH7zNtUiJbi0QHpNM="}}],"disabled":false,"authoritativeSet":false,"autoAuthMechanisms":["SCRAM-SHA-256"],"autoAuthMechanism":"SCRAM-SHA-256","deploymentAuthMechanisms":["SCRAM-SHA-256"],"autoUser":"mms-automation","key":"sQMEmMJZSLthtDxn/LYbbNIkLxeKFHkEDufbPVXaZ83NRUuPBXVI1H78wJp/tMr0vz8QH/+xhDFSPrxARzG1x8Mf0xIIvUXwqpjwq++v52S/SQWljU3lVj11P6BAYhbVyWo9kjnned8qMm4tNtxo3A5la4/VhHa5M18ZcJW7gWFZ52dST3StQWFxmqJkzp0rTwqzEROtKUA+ml40yXaGhn/7kkQNYK0TcH7u/STVNjxeWwESoAXH0DIyHw+DMIyrKtc4tKBmAIVsBPgw+mxKF/fFTXQ58D+kerG4SWew/ddH5DmFO7K1IEoPxN82U1oC9bkC18PF5dZ/djcBmNbuFDB2xhnWkRkV1Jwx78PXBody+tPB1zFZLaiSi+z6qc7cv043UOU1xffRsz1jq2CxeD6b9jTacwd4ohBlWewliMTlNv2DAZkcpxiXfUeflHDLkcrZPINgzhy42hqvAZE84IsvX00qwQ5r8osvV+6z1oubBIW9OYAhGTVjJKFbZc+NlwE6gEOEh1FnL4YTWlav010oBxr9NvanbBTIPRLzkPjTb+cAf6Ifmib6YlofwAWUIBlUYkwo57phmLAxkmV22sEZynEjVfTGq5opJ+hgU8kXsGhe3DJL1nWsxIIvBzk9lncdP/YxCPtXQC2/yuKmrfVoAMA=","keyfile":"/var/lib/mongodb-mms-automation/authentication/keyfile","keyfileWindows":"%SystemDrive%\\MMSAutomation\\versions\\keyfile","autoPwd":"yV8d3bE_2SK-aNLm7IBP"},"tls":{"CAFilePath":"","clientCertificateMode":"OPTIONAL"},"mongoDbVersions":[{"name":"7.0.9","builds":[{"platform":"linux","url":"","gitVersion":"","architecture":"amd64","flavor":"rhel","minOsVersion":"","maxOsVersion":"","modules":[]},{"platform":"linux","url":"","gitVersion":"","architecture":"amd64","flavor":"ubuntu","minOsVersion":"","maxOsVersion":"","modules":[]},{"platform":"linux","url":"","gitVersion":"","architecture":"aarch64","flavor":"ubuntu","minOsVersion":"","maxOsVersion":"","modules":[]},{"platf
orm":"linux","url":"","gitVersion":"","architecture":"aarch64","flavor":"rhel","minOsVersion":"","maxOsVersion":"","modules":[]}]}],"backupVersions":[],"monitoringVersions":[],"options":{"downloadBase":"/var/lib/mongodb-mms-automation"}}
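One detail worth noting in the cluster config above: the agent runs with `-useLocalMongoDbTools`, and every entry under `mongoDbVersions[].builds` has an empty `url`, so (on my reading) the agent expects the mongod binary to already exist inside the container image rather than downloading it. A small sketch that checks this property on a trimmed excerpt of the config:

```python
import json

# Trimmed excerpt of cluster-config.json from above. With empty "url" fields
# and -useLocalMongoDbTools, nothing is downloaded: the mongod container image
# itself must provide the mongod binary.
config = json.loads("""
{
  "mongoDbVersions": [
    {
      "name": "7.0.9",
      "builds": [
        {"platform": "linux", "url": "", "architecture": "amd64", "flavor": "rhel"},
        {"platform": "linux", "url": "", "architecture": "amd64", "flavor": "ubuntu"}
      ]
    }
  ],
  "options": {"downloadBase": "/var/lib/mongodb-mms-automation"}
}
""")

for version in config["mongoDbVersions"]:
    for build in version["builds"]:
        assert build["url"] == ""  # nothing to download for this build
print("all builds rely on a mongod binary baked into the image")
```

This is consistent with the pod failing with `exec: mongod: not found` when the image from the private registry does not contain the binary.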
- The agent health status of the faulty members:
kubectl exec -it mongo-0 -c mongodb-agent -- cat /var/log/mongodb-mms-automation/healthstatus/agent-health-status.json
{"statuses":{"test-upgrade-downgrade-5-0":{"IsInGoalState":false,"LastMongoUpTime":0,"ExpectedToBeUp":true,"ReplicationStatus":-1}},"mmsStatus":{"test-upgrade-downgrade-5-0":{"name":"test-upgrade-downgrade-5-0","lastGoalVersionAchieved":-1,"plans":[{"automationConfigVersion":1,"started":"2024-05-13T07:13:48.991315072Z","completed":null,"moves":[{"move":"Start","moveDoc":"Start the process","steps":[{"step":"StartFresh","stepDoc":"Start a mongo instance (start fresh)","isWaitStep":false,"started":"2024-05-13T07:13:48.991342226Z","completed":null,"result":""}]},{"move":"WaitAllRsMembersUp","moveDoc":"Wait until all members of this process' repl set are up","steps":[{"step":"WaitAllRsMembersUp","stepDoc":"Wait until all members of this process' repl set are up","isWaitStep":true,"started":null,"completed":null,"result":""}]},{"move":"RsInit","moveDoc":"Initialize a replica set including the current MongoDB process","steps":[{"step":"RsInit","stepDoc":"Initialize a replica set","isWaitStep":false,"started":null,"completed":null,"result":""}]},{"move":"WaitFeatureCompatibilityVersionCorrect","moveDoc":"Wait for featureCompatibilityVersion to be right","steps":[{"step":"WaitFeatureCompatibilityVersionCorrect","stepDoc":"Wait for featureCompatibilityVersion to be right","isWaitStep":true,"started":null,"completed":null,"result":""}]}]}],"errorCode":0,"errorString":""}}}
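The health status above can also be read programmatically; parsing a trimmed excerpt with the standard library shows the agent is stuck before its first goal (`lastGoalVersionAchieved: -1`) and that mongod was expected to be up but never started (`LastMongoUpTime: 0`):

```python
import json

# Trimmed excerpt of agent-health-status.json from above.
status = json.loads("""
{
  "statuses": {
    "test-upgrade-downgrade-5-0": {
      "IsInGoalState": false,
      "LastMongoUpTime": 0,
      "ExpectedToBeUp": true,
      "ReplicationStatus": -1
    }
  },
  "mmsStatus": {
    "test-upgrade-downgrade-5-0": {
      "name": "test-upgrade-downgrade-5-0",
      "lastGoalVersionAchieved": -1,
      "errorCode": 0,
      "errorString": ""
    }
  }
}
""")

for name, s in status["statuses"].items():
    # True when the agent expects mongod up but it has never reported uptime.
    never_started = s["ExpectedToBeUp"] and s["LastMongoUpTime"] == 0
    print(f"{name}: in goal state={s['IsInGoalState']}, expected up but never started={never_started}")
```

For the excerpt above this prints `test-upgrade-downgrade-5-0: in goal state=False, expected up but never started=True`, matching a process that crashed before mongod ever ran.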
- The verbose agent logs of the faulty members:
kubectl exec -it mongo-0 -c mongodb-agent -- cat /var/log/mongodb-mms-automation/automation-agent-verbose.log
[2024-05-13T07:41:37.194+0000] [.warn] [src/mongoclientservice/mongoclientservice.go:createClient:1238] [07:41:37.194] Server for client=0x0 connParams=test-upgrade-downgrade-5-0.test-upgrade-downgrade-5-svc.mongodb.svc.cluster.local:27017 (local=false) determined to be down due to error=[07:41:37.194] Non-TLS attempt failed : dial tcp 10.42.1.235:27017: connect: connection refused
[2024-05-13T07:41:37.264+0000] [.warn] [src/mongoclientservice/mongoclientservice.go:createClient:1238] [07:41:37.264] Server for client=0x0 connParams=test-upgrade-downgrade-5-0.test-upgrade-downgrade-5-svc.mongodb.svc.cluster.local:27017 (local=false) determined to be down due to error=[07:41:37.264] Non-TLS attempt failed : dial tcp 10.42.1.235:27017: connect: connection refused
[2024-05-13T07:41:37.265+0000] [.warn] [metrics/collector/util.go:getPingStatus:127] <hardwareMetricsCollector> [07:41:37.265] Failed to fetch replStatus for test-upgrade-downgrade-5-0 : <hardwareMetricsCollector> [07:41:37.265] Server at test-upgrade-downgrade-5-0.test-upgrade-downgrade-5-svc.mongodb.svc.cluster.local:27017 (local=false) is down
[2024-05-13T07:41:37.326+0000] [.warn] [src/mongoclientservice/mongoclientservice.go:createClient:1238] [07:41:37.326] Server for client=0x0 connParams=test-upgrade-downgrade-5-0.test-upgrade-downgrade-5-svc.mongodb.svc.cluster.local:27017 (local=false) determined to be down due to error=[07:41:37.326] Non-TLS attempt failed : dial tcp 10.42.1.235:27017: connect: connection refused
[2024-05-13T07:41:37.457+0000] [.warn] [src/mongoclientservice/mongoclientservice.go:createClient:1238] [07:41:37.457] Server for client=0x0 connParams=test-upgrade-downgrade-5-0.test-upgrade-downgrade-5-svc.mongodb.svc.cluster.local:27017 (local=false) determined to be down due to error=[07:41:37.457] Non-TLS attempt failed : dial tcp 10.42.1.235:27017: connect: connection refused
[2024-05-13T07:41:37.589+0000] [.warn] [src/mongoclientservice/mongoclientservice.go:createClient:1238] [07:41:37.589] Server for client=0x0 connParams=test-upgrade-downgrade-5-0.test-upgrade-downgrade-5-svc.mongodb.svc.cluster.local:27017 (local=false) determined to be down due to error=[07:41:37.589] Non-TLS attempt failed : dial tcp 10.42.1.235:27017: connect: connection refused
[2024-05-13T07:41:37.721+0000] [.warn] [src/mongoclientservice/mongoclientservice.go:createClient:1238] [07:41:37.721] Server for client=0x0 connParams=test-upgrade-downgrade-5-0.test-upgrade-downgrade-5-svc.mongodb.svc.cluster.local:27017 (local=false) determined to be down due to error=[07:41:37.721] Non-TLS attempt failed : dial tcp 10.42.1.235:27017: connect: connection refused
The cause seems to be the mongo image in our private registry, so this is not an operator bug.