How to schedule replicas and persistent volume in different availability zones
sebinnsebastiann opened this issue
What did you do to encounter the bug?
Steps to reproduce the behavior:
I have an AWS EKS cluster and I deployed a MongoDB replica set using operator version 0.9.0. The replica set deployment manifest I used is given below:
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: mongodb-test
  namespace: mongodb-test
spec:
  members: 3
  type: ReplicaSet
  version: "7.0.0"
  security:
    authentication:
      modes: ["SCRAM"]
  users:
    - name: mongouser
      db: admin
      passwordSecretRef: # a reference to the secret that will be used to generate the user's password
        name: password
      roles:
        - name: clusterAdmin
          db: admin
        - name: userAdminAnyDatabase
          db: admin
      scramCredentialsSecretName: mongodb-scram
  statefulSet:
    spec:
      template:
        spec:
          nodeSelector:
            server: mongo
          # resources can be specified by applying an override
          # per container name.
          containers:
            - name: mongod
              resources:
                limits:
                  cpu: "0.3"
                  memory: 700M
                requests:
                  cpu: "0.2"
                  memory: 500M
            - name: mongodb-agent
              resources:
                limits:
                  cpu: "0.2"
                  memory: 500M
                requests:
                  cpu: "0.1"
                  memory: 250M
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            storageClassName: sc1
            resources:
              requests:
                storage: 10Gi
        - metadata:
            name: logs-volume
          spec:
            storageClassName: sc1
            resources:
              requests:
                storage: 2Gi
Below are the commands I used to deploy MongoDB:
kubectl apply -f mongodb-kubernetes-operator/crd/mongodbcommunity.mongodb.com_mongodbcommunity.yaml
kubectl get crd/mongodbcommunity.mongodbcommunity.mongodb.com
kubectl apply -k mongodb-kubernetes-operator/rbac/ --namespace mongodb-test
kubectl create -f mongodb-kubernetes-operator/manager/manager.yaml --namespace mongodb-test
kubectl apply -f replicaset/rbac -n mongodb-test
kubectl apply -f replicaset/replica-set.yaml -n mongodb-test
What did you expect?
- I want the MongoDB replicas to be scheduled in different availability zones.
- Each replica should be scheduled in the same availability zone as its persistent volume, e.g.:
  replica1 and replica1's persistent volume in eu-west-1a
  replica2 and replica2's persistent volume in eu-west-1b
  replica3 and replica3's persistent volume in eu-west-1c
What happened instead?
All the replicas and all of their persistent volumes were scheduled in the same availability zone.
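For reference, the placement can be confirmed with standard kubectl commands; the namespace and PVC name below match the manifest above (the PVC name follows the usual claim-template-plus-pod-name pattern), and topology.kubernetes.io/zone is the well-known node topology label:
# show which node each replica pod landed on
kubectl get pods -n mongodb-test -o wide
# show the availability zone label of each node
kubectl get nodes -L topology.kubernetes.io/zone
# inspect a bound PersistentVolume's node affinity to see which AZ it is pinned to
kubectl describe pv $(kubectl get pvc data-volume-mongodb-test-0 -n mongodb-test -o jsonpath='{.spec.volumeName}')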
Operator Information
- Operator Version: v0.9.0
- MongoDB Image used: quay.io/mongodb/mongodb-community-server:7.0.0-ubi8
Kubernetes Cluster Information
- Distribution: AWS EKS
- Version: v1.26.11-eks-8cb36c9
Hey, you can set spec.statefulSet.spec.template.spec.affinity.podAntiAffinity in the MongoDB resource, or even topologySpreadConstraints; see the sketch below.
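For illustration, a minimal sketch of such a topologySpreadConstraints override in the MongoDBCommunity resource; the pod label used in the selector (app: mongodb-test-svc) is an assumption about the labels the operator puts on the replica set pods, so verify it with kubectl get pods --show-labels before relying on it:
spec:
  statefulSet:
    spec:
      template:
        spec:
          topologySpreadConstraints:
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/zone   # spread pods across availability zones
              whenUnsatisfiable: DoNotSchedule
              labelSelector:
                matchLabels:
                  app: mongodb-test-svc   # assumed pod label; check with kubectl get pods --show-labels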
But after a pod restart, the pod may be scheduled in az1 while its persistent volume is in az2, which causes a volume node affinity error. How can we tackle this issue?
The kube-scheduler accounts for the fact that the volume is in a specific AZ and will schedule the pod in the correct AZ.
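For context, a dynamically provisioned EBS-backed PersistentVolume carries a nodeAffinity stanza that pins it to one zone, roughly like the sketch below; the topology key shown assumes the AWS EBS CSI driver and the zone value is illustrative. This is what the scheduler consults when placing the pod:
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.ebs.csi.aws.com/zone   # assumed CSI topology key; in-tree EBS volumes use topology.kubernetes.io/zone
              operator: In
              values:
                - eu-west-1a   # illustrative zone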
If a pod is rescheduled, deleted and recreated, or the instance where the pod was running is terminated, and the pod reuses an existing EBS volume, there is still a chance that the pod will be scheduled in an AZ where the EBS volume doesn't exist.
I'm getting a volume node affinity conflict error in the above scenario. The kube-scheduler isn't able to schedule the pod in the correct AZ where the volume exists.
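One mitigation worth checking is the StorageClass binding mode: with volumeBindingMode: WaitForFirstConsumer, the volume is only provisioned after the pod has been scheduled, so it is created in the pod's AZ rather than an arbitrary one. A sketch, assuming the AWS EBS CSI driver and the sc1 class name referenced in the manifest above; note it does not move an already-provisioned volume, and a rescheduled pod still needs a schedulable node in the volume's AZ:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: sc1
provisioner: ebs.csi.aws.com              # assumes the AWS EBS CSI driver
parameters:
  type: sc1                               # EBS cold HDD volume type
volumeBindingMode: WaitForFirstConsumer   # provision the volume in the AZ where the pod is scheduled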
This issue is being marked stale because it has been open for 60 days with no activity. Please comment if this issue is still affecting you. If there is no change, this issue will be closed in 30 days.
This issue was closed because it became stale and did not receive further updates. If the issue is still affecting you, please re-open it, or file a fresh Issue with updated information.