cert-manager / csi-driver

A Kubernetes CSI plugin to automatically mount signed certificates to Pods using ephemeral volumes

Home Page: https://cert-manager.io/docs/usage/csi-driver/


TLS Files not being renewed

Taranasaur opened this issue

I'm using cert-manager-csi to mount short-lived certs signed by Vault, with the following in my deployment manifest:


  volumes:
    - name: tls
      csi:
        driver: csi.cert-manager.io
        volumeAttributes:
          csi.cert-manager.io/issuer-name: dbc-vault-issuer
          csi.cert-manager.io/duration: 320s
          csi.cert-manager.io/renew-before: 190s
          csi.cert-manager.io/dns-names: redapi.dbclient.com,redapi2.dbclient.com


A packet capture shows the AppRole login being used, with the response below:

{
  "request_id": "edfe5ddb-76c6-7b20-9795-5142bd10d725",
  "lease_id": "",
  "renewable": false,     #Assume this is for the AppRole itself
  "lease_duration": 0,
  "data": null,
  "wrap_info": null,
  "warnings": null,
  "auth": {
    "client_token": "s.uQIFnphv9EHAD88gdtCA3Dvm",
    "accessor": "Vs9hLekvYn6hFt0XhNYHM30n",
    "policies": [
      "default",
      "kube-allow-sign"
    ],
    "token_policies": [
      "default",
      "kube-allow-sign"
    ],
    "metadata": {
      "role_name": "kube-role"
    },
    "lease_duration": 300,
    "renewable": true,
    "entity_id": "d8b5ab38-7979-b149-7494-f0c19c5b1e1a",
    "token_type": "service",
    "orphan": true
  }
}
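(For reference, the same AppRole login can be reproduced by hand against Vault; the role/secret IDs are placeholders:)

vault write auth/approle/login \
    role_id="<role-id>" \
    secret_id="<secret-id>"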

The returned certs:

{
  "request_id": "33094682-002d-6347-f4de-fa4f19b94107",
  "lease_id": "pki_int/sign/dbclients/8jcXTkY4tG28L91d3frYzzu8",
  "renewable": false,
  "lease_duration": 319,
  "data": {
    "ca_chain": [
 {truncated}

The TLS creds are mounted successfully, and the cert-manager-csi logs indicate that a second request is made prior to expiry of the certs:

I0129 14:24:23.314302       1 server.go:114] server: request: {"target_path":"/var/lib/kubelet/pods/8b3cb69b-eb44-437c-9e7c-06e9c212c84f/volumes/kubernetes.io~csi/tls/mount","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":1}},"volume_context":{"csi.cert-manager.io/dns-names":"redapi.dbclient.com,redapi2.dbclient.com","csi.cert-manager.io/duration":"320s","csi.cert-manager.io/issuer-name":"dbc-vault-issuer","csi.cert-manager.io/renew-before":"190s","csi.storage.k8s.io/ephemeral":"true","csi.storage.k8s.io/pod.name":"app-example-deployment-7dd587b8b7-hwwjl","csi.storage.k8s.io/pod.namespace":"default","csi.storage.k8s.io/pod.uid":"8b3cb69b-eb44-437c-9e7c-06e9c212c84f","csi.storage.k8s.io/serviceAccount.name":"web"},"volume_id":"csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1"}
I0129 14:24:23.318975       1 nodeserver.go:84] node: created volume: /csi-data-dir/csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1
I0129 14:24:23.319012       1 nodeserver.go:86] node: creating key/cert pair with cert-manager: /csi-data-dir/csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1
I0129 14:24:23.657499       1 certmanager.go:141] cert-manager: created CertificateRequest csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1
I0129 14:24:23.657530       1 certmanager.go:143] cert-manager: waiting for CertificateRequest to become ready csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1
I0129 14:24:23.657543       1 certmanager.go:267] cert-manager: polling CertificateRequest csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1/default for ready status
I0129 14:24:24.662819       1 certmanager.go:267] cert-manager: polling CertificateRequest csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1/default for ready status
I0129 14:24:24.667999       1 certmanager.go:160] cert-manager: metadata written to file /csi-data-dir/csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1/metadata.json
I0129 14:24:24.668568       1 certmanager.go:181] cert-manager: certificate written to file /csi-data-dir/csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1/data/crt.pem
I0129 14:24:24.668678       1 certmanager.go:188] cert-manager: private key written to file: /csi-data-dir/csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1/data/key.pem
I0129 14:24:24.668694       1 renew.go:172] renewer: starting to watch certificate for renewal: "csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1"
I0129 14:24:24.669069       1 nodeserver.go:131] node: publish volume request ~ target:/var/lib/kubelet/pods/8b3cb69b-eb44-437c-9e7c-06e9c212c84f/volumes/kubernetes.io~csi/tls/mount volumeId:csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1 attributes:map[csi.cert-manager.io/ca-file:ca.pem csi.cert-manager.io/certificate-file:crt.pem csi.cert-manager.io/dns-names:redapi.dbclient.com,redapi2.dbclient.com csi.cert-manager.io/duration:320s csi.cert-manager.io/is-ca:false csi.cert-manager.io/issuer-group:cert-manager.io csi.cert-manager.io/issuer-kind:Issuer csi.cert-manager.io/issuer-name:dbc-vault-issuer csi.cert-manager.io/privatekey-file:key.pem csi.cert-manager.io/renew-before:190s csi.storage.k8s.io/ephemeral:true csi.storage.k8s.io/pod.name:app-example-deployment-7dd587b8b7-hwwjl csi.storage.k8s.io/pod.namespace:default csi.storage.k8s.io/pod.uid:8b3cb69b-eb44-437c-9e7c-06e9c212c84f csi.storage.k8s.io/serviceAccount.name:web]
I0129 14:24:24.669152       1 mount.go:68] Mounting cmd (mount) with arguments ([-o bind,ro /csi-data-dir/csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1/data /var/lib/kubelet/pods/8b3cb69b-eb44-437c-9e7c-06e9c212c84f/volumes/kubernetes.io~csi/tls/mount])
I0129 14:24:24.677405       1 nodeserver.go:143] node: mount successful default:app-example-deployment-7dd587b8b7-hwwjl:csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1
I0129 14:24:24.677438       1 server.go:119] server: response: {}
I0129 14:24:28.168855       1 server.go:113] server: call: /csi.v1.Node/NodeGetCapabilities
I0129 14:24:28.168877       1 server.go:114] server: request: {}
E0129 14:24:28.169495       1 server.go:117] server: error: rpc error: code = Unimplemented desc =
I0129 14:24:36.959593       1 server.go:113] server: call: /csi.v1.Node/NodeUnpublishVolume
I0129 14:24:36.959631       1 server.go:114] server: request: {"target_path":"/var/lib/kubelet/pods/c0be1b32-3af6-4f48-a002-321ee0f503ed/volumes/kubernetes.io~csi/tls/mount","volume_id":"csi-f6efa90d1d8d57da6b3ee148b9db840ab0864dec3ae9d712fbd5ece7c4090761"}
I0129 14:24:36.977444       1 renew.go:207] renewer: killing watcher for "csi-f6efa90d1d8d57da6b3ee148b9db840ab0864dec3ae9d712fbd5ece7c4090761"
I0129 14:24:36.977472       1 mount.go:54] Unmounting /var/lib/kubelet/pods/c0be1b32-3af6-4f48-a002-321ee0f503ed/volumes/kubernetes.io~csi/tls/mount
I0129 14:24:36.985463       1 nodeserver.go:170] node: volume /var/lib/kubelet/pods/c0be1b32-3af6-4f48-a002-321ee0f503ed/volumes/kubernetes.io~csi/tls/mount/csi-f6efa90d1d8d57da6b3ee148b9db840ab0864dec3ae9d712fbd5ece7c4090761 has been unmounted.
I0129 14:24:36.985507       1 nodeserver.go:172] node: deleting volume csi-f6efa90d1d8d57da6b3ee148b9db840ab0864dec3ae9d712fbd5ece7c4090761
I0129 14:24:36.986412       1 server.go:119] server: response: {}
I0129 14:25:09.782289       1 server.go:113] server: call: /csi.v1.Node/NodeGetCapabilities
I0129 14:25:09.782367       1 server.go:114] server: request: {}
E0129 14:25:09.783236       1 server.go:117] server: error: rpc error: code = Unimplemented desc =
I0129 14:25:20.888931       1 server.go:113] server: call: /csi.v1.Node/NodeGetCapabilities
I0129 14:25:20.888960       1 server.go:114] server: request: {}
E0129 14:25:20.890100       1 server.go:117] server: error: rpc error: code = Unimplemented desc =
I0129 14:26:29.866556       1 server.go:113] server: call: /csi.v1.Node/NodeGetCapabilities
I0129 14:26:29.866595       1 server.go:114] server: request: {}
E0129 14:26:29.867498       1 server.go:117] server: error: rpc error: code = Unimplemented desc =
I0129 14:26:33.000903       1 certmanager.go:197] cert-manager: renewing certicate csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1
I0129 14:26:34.445308       1 certmanager.go:141] cert-manager: created CertificateRequest csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1
I0129 14:26:34.445393       1 certmanager.go:143] cert-manager: waiting for CertificateRequest to become ready csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1
I0129 14:26:34.445410       1 certmanager.go:267] cert-manager: polling CertificateRequest csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1/default for ready status
I0129 14:26:34.450772       1 certmanager.go:160] cert-manager: metadata written to file /csi-data-dir/csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1/metadata.json
I0129 14:26:34.451772       1 certmanager.go:181] cert-manager: certificate written to file /csi-data-dir/csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1/data/crt.pem
I0129 14:26:34.452034       1 certmanager.go:188] cert-manager: private key written to file: /csi-data-dir/csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1/data/key.pem
E0129 14:26:34.452064       1 renew.go:158] volume already being watched, aborting second watcher: csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1
I0129 14:26:53.022217       1 server.go:113] server: call: /csi.v1.Node/NodeGetCapabilities
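That renewal attempt at 14:26:33 is right on schedule: the cert was issued at ~14:24:24 with a 319s lifetime, so with renew-before: 190s the renewal should fire 319 - 190 = 129s after issuance, i.e. at about 14:26:33. The renewal timer itself is clearly working.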

However, the Wireshark capture shows no attempt by cert-manager to sign a new certificate, nor any attempt to log in again using the AppRole. The files written inside /csi-data-dir/csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1/data/ also do not change (so a new cert isn't being conjured up out of nowhere either).
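An easy way to watch for any change is on the node itself, since csi-data-dir is the hostPath /tmp/cert-manager-csi (see the pod details below):

# run on the worker node; prints the validity window of the mounted cert
watch -n 30 "openssl x509 -noout -dates \
    -in /tmp/cert-manager-csi/csi-be368c20fe7f8a57a200b808062dc31328db59ad75ff38341fd388c8b14a96f1/data/crt.pem"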

Details regarding one of the CSI pods (kud and ku below are local aliases for kubectl describe pod and kubectl):

[root@BifMaster001 vault-helm]# kud cert-manager-csi-tnnx2 -n cert-manager
Name:         cert-manager-csi-tnnx2
Namespace:    cert-manager
Priority:     0
Node:         bifworker001/10.204.5.88
Start Time:   Thu, 23 Jan 2020 17:41:01 +0000
Labels:       app=cert-manager-csi
              controller-revision-hash=5688994456
              pod-template-generation=5
Annotations:  <none>
Status:       Running
IP:           10.244.1.56
IPs:
  IP:           10.244.1.56
Controlled By:  DaemonSet/cert-manager-csi
Containers:
  node-driver-registrar:
    Container ID:  docker://0ab3aaf5c9abcc61273f481dd4ec023a66b89e390092cc681d2c35c98d4e7b4a
    Image:         quay.io/k8scsi/csi-node-driver-registrar:v1.0.2
    Image ID:      docker-pullable://quay.io/k8scsi/csi-node-driver-registrar@sha256:ffecfbe6ae9f446e5102cbf2c73041d63ccf44bcfd72e2f2a62174a3a185eb69
    Port:          <none>
    Host Port:     <none>
    Args:
      --v=5
      --csi-address=/plugin/csi.sock
      --kubelet-registration-path=/var/lib/kubelet/plugins/cert-manager-csi/csi.sock
    State:          Running
      Started:      Thu, 23 Jan 2020 17:41:01 +0000
    Ready:          True
    Restart Count:  0
    Environment:
      KUBE_NODE_NAME:   (v1:spec.nodeName)
    Mounts:
      /plugin from plugin-dir (rw)
      /registration from registration-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from cert-manager-csi-token-8gpbx (ro)
  cert-manager-csi:
    Container ID:  docker://f4abc35f65634a4331d0e400d4879e6941c995d84b7839203ac10d768e41d150
    Image:         gcr.io/jetstack-josh/cert-manager-csi:v0.1.0-alpha.1
    Image ID:      docker-pullable://gcr.io/jetstack-josh/cert-manager-csi@sha256:ff9027232b275e904e970c6ab92a268183ed9be3b70c56cc6548504484d3bc61
    Port:          <none>
    Host Port:     <none>
    Args:
      --node-id=$(NODE_ID)
      --endpoint=$(CSI_ENDPOINT)
      --data-root=/csi-data-dir
      --v=5
    State:          Running
      Started:      Thu, 23 Jan 2020 17:41:02 +0000
    Ready:          True
    Restart Count:  0
    Environment:
      NODE_ID:        (v1:spec.nodeName)
      CSI_ENDPOINT:  unix://plugin/csi.sock
    Mounts:
      /csi-data-dir from csi-data-dir (rw)
      /plugin from plugin-dir (rw)
      /var/lib/kubelet/pods from pods-mount-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from cert-manager-csi-token-8gpbx (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  plugin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins/cert-manager-csi
    HostPathType:  DirectoryOrCreate
  pods-mount-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/pods
    HostPathType:  Directory
  registration-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins_registry
    HostPathType:  Directory
  csi-data-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /tmp/cert-manager-csi
    HostPathType:  DirectoryOrCreate
  cert-manager-csi-token-8gpbx:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cert-manager-csi-token-8gpbx
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:          <none>

Additionally, as an aside: if we spin up more than one application, it appears the received certs are occasionally swapped (e.g. Pod B gets Pod A's issued certs and vice versa). I'll perhaps raise a separate issue/query regarding that.
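(A quick way to confirm that swap would be to compare the serial/subject of each pod's mounted cert; the in-container path /tls/crt.pem is hypothetical and depends on the volumeMount, and this assumes openssl is available in the app image:)

kubectl exec <pod-name> -- openssl x509 -noout -serial -subject -in /tls/crt.pem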

Hey, sorry about this. There is a PR open to fix this issue, as well as to make the driver a lot more stable. Once that merges we expect to get the first release out.

/cc @munnerz

Great, thanks Josh. Looking forward to the update :) (Keeping cm-csi in my design for now.)

Hi Josh, I'm still experiencing this issue after updating to the new images:

 ku get daemonset -n cert-manager cert-manager-csi -o yaml | grep image
        image: quay.io/k8scsi/csi-node-driver-registrar:v1.2.0
        imagePullPolicy: IfNotPresent
        image: gcr.io/jetstack-josh/cert-manager-csi:v0.1.0-alpha.1
        imagePullPolicy: IfNotPresent

ku get replicaset -n cert-manager cert-manager-77bbfd565c -o yaml | grep image
        image: quay.io/jetstack/cert-manager-controller:v0.13.0

Hi @Taranasaur,

Sorry you're still having issues. Have you tried building an image from source?

That image is really just my personal one for testing and shouldn't be used publicly. We will be releasing v0.1 soon, which will live under quay.io/jetstack/cert-manager-csi.
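Once that's out, switching over should just be an image swap on the DaemonSet, e.g. (the tag is illustrative; use whichever v0.1 tag gets released):

kubectl set image daemonset/cert-manager-csi -n cert-manager \
    cert-manager-csi=quay.io/jetstack/cert-manager-csi:v0.1.0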

I'll take a look at building from source.
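Something like this, assuming the repo's root Dockerfile builds the driver image:

git clone https://github.com/cert-manager/csi-driver
cd csi-driver
docker build -t cert-manager-csi:dev .
# then point the DaemonSet's cert-manager-csi container at this image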