cockroachdb / helm-charts

Helm charts for cockroachdb

Upgrading to Helm chart 11.0.x removes SecurityContext

Chili-Man opened this issue

We'd like to upgrade from Helm chart 10.0.8 to the current 11.0.3, but one change in 11.0.x causes the securityContext to be removed from the StatefulSet (and other resources).

I ran a helm diff between what we currently have deployed and the new Helm chart. It shows the pod-level and container-level securityContext blocks being removed:

cockroachdb, cockroachdb-r0, StatefulSet (apps) has changed:
  # Source: cockroachdb/charts/cockroachdb/templates/statefulset.yaml
  kind: StatefulSet
  apiVersion: apps/v1
  metadata:
    name: cockroachdb-r0
    namespace: "cockroachdb"
    labels:
-     helm.sh/chart: cockroachdb-10.0.8
+     helm.sh/chart: cockroachdb-11.0.3
      app.kubernetes.io/name: cockroachdb
      app.kubernetes.io/instance: "cockroachdb-r0"
      app.kubernetes.io/managed-by: "Helm"
      app.kubernetes.io/component: cockroachdb
      core.noteable.io/entity: noteable
+     tags.datadoghq.com/service: cockroachdb
+     tags.datadoghq.com/version: v22.10.4
  spec:
    serviceName: cockroachdb-r0
    replicas: 3
    updateStrategy:
      type: RollingUpdate
    podManagementPolicy: "Parallel"
    selector:
      matchLabels:
        app.kubernetes.io/name: cockroachdb
        app.kubernetes.io/instance: "cockroachdb-r0"
        app.kubernetes.io/component: cockroachdb
    template:
      metadata:
        labels:
          app.kubernetes.io/name: cockroachdb
          app.kubernetes.io/instance: "cockroachdb-r0"
          app.kubernetes.io/component: cockroachdb
          core.noteable.io/entity: noteable
+         tags.datadoghq.com/service: cockroachdb
+         tags.datadoghq.com/version: v22.10.4
        annotations:
          ad.datadoghq.com/db.check_names: |
            ["cockroachdb"]
          ad.datadoghq.com/db.init_configs: |
            [{}]
          ad.datadoghq.com/db.instances: |
            [
              {
                "prometheus_url": "https://%%host%%:8080/_status/vars",
                "tls_verify": false,
                "tls_ignore_warning": true
              }
            ]
      spec:
        serviceAccountName: cockroachdb-r0
        initContainers:
          - name: copy-certs
            image: "busybox"
            imagePullPolicy: "IfNotPresent"
            command:
              - /bin/sh
              - -c
              - "cp -f /certs/* /cockroach-certs/; chmod 0400 /cockroach-certs/*.key"
            env:
              - name: POD_NAMESPACE
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.namespace
            volumeMounts:
              - name: certs
                mountPath: /cockroach-certs/
              - name: certs-secret
                mountPath: /certs/
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: core.noteable.io/node-role
                  operator: In
                  values:
                  - datastore
          podAntiAffinity:
            preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 100
                podAffinityTerm:
                  topologyKey: topology.kubernetes.io/zone
                  labelSelector:
                    matchLabels:
                      app.kubernetes.io/name: cockroachdb
                      app.kubernetes.io/instance: "cockroachdb-r0"
                      app.kubernetes.io/component: cockroachdb
        topologySpreadConstraints:
        - labelSelector:
            matchLabels:
              app.kubernetes.io/name: cockroachdb
              app.kubernetes.io/instance: "cockroachdb-r0"
              app.kubernetes.io/component: cockroachdb
          maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
        tolerations:
          - effect: NoSchedule
            key: datastore
            operator: Exists
        # No pre-stop hook is required, a SIGTERM plus some time is all that's
        # needed for graceful shutdown of a node.
        terminationGracePeriodSeconds: 60
        containers:
          - name: db
-           image: "cockroachdb/cockroach:v22.2.8"
+           image: "cockroachdb/cockroach:v23.1.4"
            imagePullPolicy: "Always"
            args:
              - shell
              - -ecx
              # The use of qualified `hostname -f` is crucial:
              # Other nodes aren't able to look up the unqualified hostname.
              #
              # `--join` CLI flag is hardcoded to exactly 3 Pods, because:
              # 1. Having `--join` value depending on `statefulset.replicas`
              #    will trigger undesired restart of existing Pods when
              #    StatefulSet is scaled up/down. We want to scale without
              #    restarting existing Pods.
              # 2. At least one Pod in `--join` is enough to successfully
              #    join CockroachDB cluster and gossip with all other existing
              #    Pods, even if there are 3 or more Pods.
              # 3. It's harmless for `--join` to have 3 Pods even for 1-Pod
              #    clusters, while it gives us opportunity to scale up even if
              #    some Pods of existing cluster are down (for whatever reason).
              # See details explained here:
              # https://github.com/helm/charts/pull/18993#issuecomment-558795102
              - >-
                exec /cockroach/cockroach
                start --join=${STATEFULSET_NAME}-0.${STATEFULSET_FQDN}:26257,${STATEFULSET_NAME}-1.${STATEFULSET_FQDN}:26257,${STATEFULSET_NAME}-2.${STATEFULSET_FQDN}:26257
                --cluster-name=noteable
                --advertise-host=$(hostname).${STATEFULSET_FQDN}
                --certs-dir=/cockroach/cockroach-certs/
                --http-port=8080
                --port=26257
                --cache=25%
                --max-sql-memory=25%
                --logtostderr=INFO
            env:
              - name: STATEFULSET_NAME
                value: cockroachdb-r0
              - name: STATEFULSET_FQDN
                value: cockroachdb-r0.cockroachdb.svc.cluster.local
              - name: COCKROACH_CHANNEL
                value: kubernetes-helm
            ports:
              - name: grpc
                containerPort: 26257
                protocol: TCP
              - name: http
                containerPort: 8080
                protocol: TCP
            volumeMounts:
              - name: datadir
                mountPath: /cockroach/cockroach-data/
              - name: certs
                mountPath: /cockroach/cockroach-certs/
              - name: certs-secret
                mountPath: /cockroach/certs/
            livenessProbe:
              
              httpGet:
                path: /health
                port: http
                scheme: HTTPS
              initialDelaySeconds: 300
              periodSeconds: 5
            readinessProbe:
              httpGet:
                path: /health?ready=1
                port: http
                scheme: HTTPS
              initialDelaySeconds: 10
              periodSeconds: 5
              failureThreshold: 2
-           securityContext:
-             allowPrivilegeEscalation: false
-             capabilities:
-               drop:
-                 - ALL
-             privileged: false
-             readOnlyRootFilesystem: true
            resources:
              limits:
                memory: 24Gi
              requests:
                cpu: 2500m
                memory: 24Gi
        volumes:
          - name: datadir
            persistentVolumeClaim:
              claimName: datadir
          - name: certs
            emptyDir: {}
          - name: certs-secret
            projected:
              sources:
              - secret:
                  name: cockroachdb-node
                  items:
                  - key: ca.crt
                    path: ca.crt
                    mode: 256
                  - key: tls.crt
                    path: node.crt
                    mode: 256
                  - key: tls.key
                    path: node.key
                    mode: 256
-       securityContext:
-         fsGroup: 1000
-         runAsGroup: 1000
-         runAsUser: 1000
-         runAsNonRoot: true
    volumeClaimTemplates:
      - metadata:
          name: datadir
          labels:
            app.kubernetes.io/name: cockroachdb
            app.kubernetes.io/instance: "cockroachdb-r0"
            core.noteable.io/entity: noteable
+           tags.datadoghq.com/service: cockroachdb
+           tags.datadoghq.com/version: v22.10.4
        spec:
          accessModes: ["ReadWriteOnce"]
          storageClassName: "ebs-csi-gp3-encrypted-retain"
          resources:
            requests:
              storage: "512Gi"

I believe the problem was introduced by this change: cb2db9f

Diffing against the last 10.x version (10.0.9) does not show these removals, since that version does not contain the change.
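
For anyone who wants to catch this kind of regression before applying an upgrade, here is a minimal sketch of a check that scans helm diff text for removed securityContext blocks. It is not part of the chart or of the helm-diff plugin; the function name `removed_security_contexts` is hypothetical, and it assumes uncolored diff output in which removed lines start with `-` in the first column, as in the diff above.

```python
import re

# A removed line ("-" in column 0) whose YAML key is securityContext.
REMOVED_KEY = re.compile(r"^-\s+securityContext:\s*$")


def removed_security_contexts(diff_text):
    """Return (line_number, removed_lines) for each removed securityContext block.

    Assumption: diff_text is plain helm diff output, one YAML line per diff line.
    """
    lines = diff_text.splitlines()
    results = []
    i = 0
    while i < len(lines):
        if not REMOVED_KEY.match(lines[i]):
            i += 1
            continue
        # Indentation of the key, measured after the leading "-" marker.
        body = lines[i][1:]
        indent = len(body) - len(body.lstrip())
        block = [lines[i]]
        j = i + 1
        # Collect the removed lines nested more deeply than the key itself.
        while j < len(lines) and lines[j].startswith("-"):
            child = lines[j][1:]
            if child.strip() and len(child) - len(child.lstrip()) <= indent:
                break
            block.append(lines[j])
            j += 1
        results.append((i + 1, block))
        i = j
    return results
```

Run against the diff above, this should report two blocks: the container-level one under `db` and the pod-level one with `fsGroup`, `runAsGroup`, `runAsUser`, and `runAsNonRoot`. A CI step could fail the upgrade whenever the returned list is non-empty.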

@prafull01 can you triage this?

Hey guys, this is the PR to fix this problem: #328

Closed with #328