topolvm / topolvm

Capacity-aware CSI plugin for Kubernetes

TopoLVM Snapshot Issue: Multiple Logical Volumes Created

koraypinarci opened this issue · comments

First of all, thank you for this fantastic project; it's exactly what I've been looking for. I've tested it on a Kubernetes test system with one master and one worker node.

Environment
Helm version: v3.12.1
kubectl client version: v1.27.3
Kustomize version: v5.0.1
Kubernetes server version: v1.27.6

I'd like to incorporate TopoLVM as a cache for the Actions-Runner-Controller, for Docker image caching, so snapshot functionality is essential for me. My goal is to create a PVC, start a pod, bind it to the PVC, pull the necessary images, remove the pod, and create a snapshot of the PVC. Then I want the Actions-Runner-Controller to dynamically create TopoLVM volumes from the snapshot.
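For context, the snapshot-and-restore part of this workflow uses the standard CSI snapshot resources. A minimal sketch, with placeholder resource names (only the PVC name localcache-pvc and the namespace docker appear later in this issue; everything else is assumed):

# Take a snapshot of the prepared cache PVC.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: localcache-snap
  namespace: docker
spec:
  volumeSnapshotClassName: topolvm-snapclass   # a VolumeSnapshotClass for the TopoLVM CSI driver
  source:
    persistentVolumeClaimName: localcache-pvc
---
# Restore the snapshot into a new PVC that a runner pod can mount.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: localcache-from-snap
  namespace: docker
spec:
  storageClassName: topolvm-provisioner
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: localcache-snap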

I've created a volume group on the node, deployed TopoLVM with Helm, created a PVC, created a pod, mounted the PVC to the pod, pulled images, and received the following message when creating the snapshot: "Failed to check and update snapshot content: failed to take a snapshot of the volume 25e63f2f-7fce-4fe9-bad1-f8d074ab574c: "rpc error: code = Unimplemented desc = device class is not thin. Thick snapshots are not implemented yet."

My initial TopoLVM config (Helm values) looked like this:

# useLegacy -- If true, the legacy plugin name and legacy custom resource group is used(topolvm.cybozu.com).
useLegacy: false

image:
  # image.repository -- TopoLVM image repository to use.
  repository: ghcr.io/topolvm/topolvm-with-sidecar

  # image.tag -- TopoLVM image tag to use.
  # @default -- `{{ .Chart.AppVersion }}`
  tag: #13.0.1

  # image.pullPolicy -- TopoLVM image pullPolicy.
  pullPolicy:  # Always

  # image.pullSecrets -- List of imagePullSecrets.
  pullSecrets: []

  csi:
    # image.csi.nodeDriverRegistrar -- Specify csi-node-driver-registrar: image.
    # If not specified, `ghcr.io/topolvm/topolvm-with-sidecar:{{ .Values.image.tag }}` will be used.
    nodeDriverRegistrar:  # registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.2.0

    # image.csi.csiProvisioner -- Specify csi-provisioner image.
    # If not specified, `ghcr.io/topolvm/topolvm-with-sidecar:{{ .Values.image.tag }}` will be used.
    csiProvisioner:  # registry.k8s.io/sig-storage/csi-provisioner:v2.2.1

    # image.csi.csiResizer -- Specify csi-resizer image.
    # If not specified, `ghcr.io/topolvm/topolvm-with-sidecar:{{ .Values.image.tag }}` will be used.
    csiResizer:  # registry.k8s.io/sig-storage/csi-resizer:v1.2.0

    # image.csi.csiSnapshotter -- Specify csi-snapshot image.
    # If not specified, `ghcr.io/topolvm/topolvm-with-sidecar:{{ .Values.image.tag }}` will be used.
    csiSnapshotter:  # registry.k8s.io/sig-storage/csi-snapshotter:v5.0.1

    # image.csi.livenessProbe -- Specify livenessprobe image.
    # If not specified, `ghcr.io/topolvm/topolvm-with-sidecar:{{ .Values.image.tag }}` will be used.
    livenessProbe:  # registry.k8s.io/sig-storage/livenessprobe:v2.3.0

# A scheduler extender for TopoLVM
scheduler:
  # scheduler.enabled --  If true, enable scheduler extender for TopoLVM
  enabled: false

  # scheduler.args -- Arguments to be passed to the command.
  args: []

  # scheduler.type -- If you run with a managed control plane (such as GKE, AKS, etc), topolvm-scheduler should be deployed as Deployment and Service.
  # topolvm-scheduler should otherwise be deployed as DaemonSet in unmanaged (i.e. bare metal) deployments.
  # possible values:  daemonset/deployment
  type: daemonset

  # Use only if you choose `scheduler.type` deployment
  deployment:
    # scheduler.deployment.replicaCount -- Number of replicas for Deployment. 
    replicaCount: 1

  # Use only if you choose `scheduler.type` deployment
  service:
    # scheduler.service.type -- Specify Service type.
    type: LoadBalancer
    # scheduler.service.clusterIP -- Specify Service clusterIP.
    clusterIP:  # None
    # scheduler.service.nodePort -- (int) Specify nodePort.
    nodePort:  # 30251

  # scheduler.updateStrategy -- Specify updateStrategy on the Deployment or DaemonSet.
  updateStrategy: {}
  #  rollingUpdate:
  #    maxUnavailable: 1
  #  type: RollingUpdate

  # scheduler.terminationGracePeriodSeconds -- (int) Specify terminationGracePeriodSeconds on the Deployment or DaemonSet.
  terminationGracePeriodSeconds:  # 30

  # scheduler.minReadySeconds -- (int) Specify minReadySeconds on the Deployment or DaemonSet.
  minReadySeconds:  # 0

  # scheduler.affinity -- Specify affinity on the Deployment or DaemonSet.
  ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-role.kubernetes.io/control-plane
                operator: Exists

  podDisruptionBudget:
    # scheduler.podDisruptionBudget.enabled -- Specify podDisruptionBudget enabled.
    enabled: true

  # scheduler.tolerations -- Specify tolerations on the Deployment or DaemonSet.
  ## ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations:
    - key: CriticalAddonsOnly
      operator: Exists
    - key: node-role.kubernetes.io/control-plane
      effect: NoSchedule
    # node-role.kubernetes.io/master taint will not be used in k8s 1.25+.
    # cf. https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/kubeadm/2067-rename-master-label-taint/README.md
    # TODO: remove this when minimum supported version becomes 1.25.
    - key: node-role.kubernetes.io/master
      effect: NoSchedule

  # scheduler.nodeSelector -- Specify nodeSelector on the Deployment or DaemonSet.
  ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
  nodeSelector: {}

  # scheduler.priorityClassName -- Specify priorityClassName on the Deployment or DaemonSet.
  priorityClassName:

  # scheduler.schedulerOptions -- Tune the Node scoring.
  # ref: https://github.com/topolvm/topolvm/blob/master/deploy/README.md
  schedulerOptions: {}
  #  default-divisor: 1
  #  divisors:
  #    ssd: 1
  #    hdd: 10

  options:
    listen:
      # scheduler.options.listen.host -- Host used by Probe.
      host: localhost
      # scheduler.options.listen.port -- Listen port.
      port: 9251

  # scheduler.podLabels -- Additional labels to be set on the scheduler pods.
  podLabels: {}
  # scheduler.labels -- Additional labels to be added to the Deployment or Daemonset.
  labels: {}

# lvmd service
lvmd:
  # lvmd.managed -- If true, set up lvmd service with DaemonSet.
  managed: true

  # lvmd.socketName -- Specify socketName.
  socketName: /run/topolvm/lvmd.sock

  # lvmd.deviceClasses -- Specify the device-class settings.
  deviceClasses:
    - name: localcache-hdd
      volume-group: vg_localcache
      default: true
      spare-gb: 10

  # lvmd.lvcreateOptionClasses -- Specify the lvcreate-option-class settings.
  lvcreateOptionClasses: []
   #- name: localcache-hdd
   #  options:
   #    - --thin

  # lvmd.args -- Arguments to be passed to the command.
  args: []

  # lvmd.priorityClassName -- Specify priorityClassName.
  priorityClassName:

  # lvmd.tolerations -- Specify tolerations.
  ## ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations:
    - key: CriticalAddonsOnly
      operator: Exists
    - key: node-role.kubernetes.io/control-plane
      effect: NoSchedule

  # lvmd.nodeSelector -- Specify nodeSelector.
  ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
  nodeSelector: {}

  # lvmd.affinity -- Specify affinity.
  ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
  affinity: {}

  # lvmd.volumes -- Specify volumes.
  volumes: []
  #  - name: lvmd-socket-dir
  #    hostPath:
  #      path: /run/topolvm
  #      type: DirectoryOrCreate

  # lvmd.volumeMounts -- Specify volumeMounts.
  volumeMounts: []
  #  - name: lvmd-socket-dir
  #    mountPath: /run/topolvm

  # lvmd.env -- extra environment variables
  env: []
  #  - name: LVM_SYSTEM_DIR
  #    value: /tmp

  # lvmd.additionalConfigs -- Define additional LVM Daemon configs if you have additional types of nodes.
  # Please ensure nodeSelectors are non overlapping.
  additionalConfigs: []
  #  - tolerations: []
  #      nodeSelector: {}
  #      device-classes:
  #        - name: ssd
  #          volume-group: myvg2
  #          default: true
  #          spare-gb: 10

  # lvmd.updateStrategy -- Specify updateStrategy.
  updateStrategy: {}
  #  type: RollingUpdate
  #  rollingUpdate:
  #    maxSurge: 50%
  #    maxUnavailable: 50%

  # lvmd.podLabels -- Additional labels to be set on the lvmd service pods.
  podLabels: {}
  # lvmd.labels -- Additional labels to be added to the Daemonset.
  labels: {}

  # lvmd.initContainers -- Additional initContainers for the lvmd service.
  initContainers: []

# CSI node service
node:
  # node.lvmdEmbedded -- Specify whether to embed lvmd in the node container.
  # Should not be used in conjunction with lvmd.managed otherwise lvmd will be started twice.
  lvmdEmbedded: false
  # node.lvmdSocket -- Specify the socket to be used for communication with lvmd.
  lvmdSocket: /run/topolvm/lvmd.sock
  # node.kubeletWorkDirectory -- Specify the work directory of Kubelet on the host.
  # For example, on microk8s it needs to be set to `/var/snap/microk8s/common/var/lib/kubelet`
  kubeletWorkDirectory: /var/lib/kubelet

  # node.args -- Arguments to be passed to the command.
  args: []

  # node.securityContext. -- Container securityContext.
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
  securityContext:
    privileged: true

  metrics:
    # node.metrics.enabled -- If true, enable scraping of metrics by Prometheus.
    enabled: true
    # node.metrics.annotations -- Annotations for Scrape used by Prometheus.
    annotations:
      prometheus.io/port: metrics

  prometheus:
    podMonitor:
      # node.prometheus.podMonitor.enabled -- Set this to `true` to create PodMonitor for Prometheus operator.
      enabled: false

      # node.prometheus.podMonitor.additionalLabels -- Additional labels that can be used so PodMonitor will be discovered by Prometheus.
      additionalLabels: {}

      # node.prometheus.podMonitor.namespace -- Optional namespace in which to create PodMonitor.
      namespace: ""

      # node.prometheus.podMonitor.interval -- Scrape interval. If not set, the Prometheus default scrape interval is used.
      interval: ""

      # node.prometheus.podMonitor.scrapeTimeout -- Scrape timeout. If not set, the Prometheus default scrape timeout is used.
      scrapeTimeout: ""

      # node.prometheus.podMonitor.relabelings -- RelabelConfigs to apply to samples before scraping.
      relabelings: []
      # - sourceLabels: [__meta_kubernetes_service_label_cluster]
      #   targetLabel: cluster
      #   regex: (.*)
      #   replacement: ${1}
      #   action: replace

      # node.prometheus.podMonitor.metricRelabelings -- MetricRelabelConfigs to apply to samples before ingestion.
      metricRelabelings: []
      # - sourceLabels: [__meta_kubernetes_service_label_cluster]
      #   targetLabel: cluster
      #   regex: (.*)
      #   replacement: ${1}
      #   action: replace

  # node.priorityClassName -- Specify priorityClassName.
  priorityClassName:

  # node.tolerations -- Specify tolerations.
  ## ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations:
    - key: CriticalAddonsOnly
      operator: Exists
    - key: node-role.kubernetes.io/control-plane
      effect: NoSchedule

  # node.nodeSelector -- Specify nodeSelector.
  ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
  nodeSelector: {}

  # node.affinity -- Specify affinity.
  ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
  affinity: {}

  # node.volumes -- Specify volumes.
  volumes: []
  #  - name: registration-dir
  #    hostPath:
  #      path: /var/lib/kubelet/plugins_registry/
  #      type: Directory
  #  - name: node-plugin-dir
  #    hostPath:
  #      path: /var/lib/kubelet/plugins/topolvm.io/node
  #      type: DirectoryOrCreate
  #  - name: csi-plugin-dir
  #    hostPath:
  #      path: /var/lib/kubelet/plugins/kubernetes.io/csi
  #      type: DirectoryOrCreate
  #  - name: pod-volumes-dir
  #    hostPath:
  #      path: /var/lib/kubelet/pods/
  #      type: DirectoryOrCreate
  #  - name: lvmd-socket-dir
  #    hostPath:
  #      path: /run/topolvm
  #      type: Directory

  volumeMounts:
    # node.volumeMounts.topolvmNode -- Specify volumes.
    topolvmNode: []
    # - name: node-plugin-dir
    #   mountPath: /var/lib/kubelet/plugins/topolvm.io/node
    # - name: csi-plugin-dir
    #   mountPath: /var/lib/kubelet/plugins/kubernetes.io/csi
    #   mountPropagation: "Bidirectional"
    # - name: pod-volumes-dir
    #   mountPath: /var/lib/kubelet/pods
    #   mountPropagation: "Bidirectional"
    # - name: lvmd-socket-dir
    #   mountPath: /run/topolvm

  # node.updateStrategy -- Specify updateStrategy.
  updateStrategy: {}
  #  type: RollingUpdate
  #  rollingUpdate:
  #    maxSurge: 50%
  #    maxUnavailable: 50%

  # node.podLabels -- Additional labels to be set on the node pods.
  podLabels: {}
  # node.labels -- Additional labels to be added to the Daemonset.
  labels: {}

  # node.initContainers -- Additional initContainers for the node service.
  initContainers: []

# CSI controller service
controller:
  # controller.replicaCount -- Number of replicas for CSI controller service.
  replicaCount: 1

  # controller.args -- Arguments to be passed to the command.
  args: []

  storageCapacityTracking:
    # controller.storageCapacityTracking.enabled -- Enable Storage Capacity Tracking for csi-provisioner.
    enabled: true

  securityContext:
    # controller.securityContext.enabled -- Enable securityContext.
    enabled: true

  nodeFinalize:
    # controller.nodeFinalize.skipped -- Skip automatic cleanup of PhysicalVolumeClaims when a Node is deleted.
    skipped: false

  prometheus:
    podMonitor:
      # controller.prometheus.podMonitor.enabled -- Set this to `true` to create PodMonitor for Prometheus operator.
      enabled: false

      # controller.prometheus.podMonitor.additionalLabels -- Additional labels that can be used so PodMonitor will be discovered by Prometheus.
      additionalLabels: {}

      # controller.prometheus.podMonitor.namespace -- Optional namespace in which to create PodMonitor.
      namespace: ""

      # controller.prometheus.podMonitor.interval -- Scrape interval. If not set, the Prometheus default scrape interval is used.
      interval: ""

      # controller.prometheus.podMonitor.scrapeTimeout -- Scrape timeout. If not set, the Prometheus default scrape timeout is used.
      scrapeTimeout: ""

      # controller.prometheus.podMonitor.relabelings -- RelabelConfigs to apply to samples before scraping.
      relabelings: []
      # - sourceLabels: [__meta_kubernetes_service_label_cluster]
      #   targetLabel: cluster
      #   regex: (.*)
      #   replacement: ${1}
      #   action: replace

      # controller.prometheus.podMonitor.metricRelabelings -- MetricRelabelConfigs to apply to samples before ingestion.
      metricRelabelings: []
      # - sourceLabels: [__meta_kubernetes_service_label_cluster]
      #   targetLabel: cluster
      #   regex: (.*)
      #   replacement: ${1}
      #   action: replace

  # controller.terminationGracePeriodSeconds -- (int) Specify terminationGracePeriodSeconds.
  terminationGracePeriodSeconds:  # 10

  # controller.priorityClassName -- Specify priorityClassName.
  priorityClassName:

  # controller.updateStrategy -- Specify updateStrategy.
  updateStrategy: {}
  #  type: RollingUpdate
  #  rollingUpdate:
  #    maxSurge: 50%
  #    maxUnavailable: 50%

  # controller.minReadySeconds -- (int) Specify minReadySeconds.
  minReadySeconds:  # 0

  # controller.affinity -- Specify affinity.
  ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
  affinity: |
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/component
                operator: In
                values:
                  - controller
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - {{ include "topolvm.name" . }}
          topologyKey: kubernetes.io/hostname

  # controller.tolerations -- Specify tolerations.
  ## ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations:
    - key: CriticalAddonsOnly
      operator: Exists
    - key: node-role.kubernetes.io/control-plane
      effect: NoSchedule

  # controller.nodeSelector -- Specify nodeSelector.
  ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
  nodeSelector: {}

  # controller.volumes -- Specify volumes.
  volumes:
    - name: socket-dir
      emptyDir: {}

  podDisruptionBudget:
    # controller.podDisruptionBudget.enabled -- Specify podDisruptionBudget enabled.
    enabled: true

  # controller.podLabels -- Additional labels to be set on the controller pod.
  podLabels: {}
  # controller.labels -- Additional labels to be added to the Deployment.
  labels: {}

  # controller.initContainers -- Additional initContainers for the controller service.
  initContainers: []

resources:
  # resources.topolvm_node -- Specify resources.
  ## ref: https://kubernetes.io/docs/user-guide/compute-resources/
  topolvm_node: {}
  #  requests:
  #    memory: 100Mi
  #    cpu: 100m
  #  limits:
  #    memory: 500Mi
  #    cpu: 500m
  # resources.csi_registrar -- Specify resources.
  ## ref: https://kubernetes.io/docs/user-guide/compute-resources/
  csi_registrar: {}
  # requests:
  #   cpu: "25m"
  #   memory: "10Mi"
  # limits:
  #   cpu: "200m"
  #   memory: "200Mi"
  # resources.liveness_probe -- Specify resources.
  ## ref: https://kubernetes.io/docs/user-guide/compute-resources/
  liveness_probe: {}
  # requests:
  #   cpu: "25m"
  #   memory: "10Mi"
  # limits:
  #   cpu: "200m"
  #   memory: "200Mi"
  # resources.topolvm_controller -- Specify resources.
  ## ref: https://kubernetes.io/docs/user-guide/compute-resources/
  topolvm_controller: {}
  #  requests:
  #    memory: "50Mi"
  #    cpu: "50m"
  #  limits:
  #    memory: "200Mi"
  #    cpu: "200m"
  # resources.csi_provisioner -- Specify resources.
  ## ref: https://kubernetes.io/docs/user-guide/compute-resources/
  csi_provisioner: {}
  #  requests:
  #    memory: "50Mi"
  #    cpu: "50m"
  #  limits:
  #    memory: "200Mi"
  #    cpu: "200m"
  # resources.csi_resizer -- Specify resources.
  ## ref: https://kubernetes.io/docs/user-guide/compute-resources/
  csi_resizer: {}
  #  requests:
  #    memory: "50Mi"
  #    cpu: "50m"
  #  limits:
  #    memory: "200Mi"
  #    cpu: "200m"
  # resources.csi_snapshotter -- Specify resources.
  ## ref: https://kubernetes.io/docs/user-guide/compute-resources/
  csi_snapshotter: {}
  #  requests:
  #    memory: "50Mi"
  #    cpu: "50m"
  #  limits:
  #    memory: "200Mi"
  #    cpu: "200m"
  # resources.lvmd -- Specify resources.
  ## ref: https://kubernetes.io/docs/user-guide/compute-resources/
  lvmd: {}
  #  requests:
  #    memory: 100Mi
  #    cpu: 100m
  #  limits:
  #    memory: 500Mi
  #    cpu: 500m
  # resources.topolvm_scheduler -- Specify resources.
  ## ref: https://kubernetes.io/docs/user-guide/compute-resources/
  topolvm_scheduler: {}
  #  requests:
  #    memory: "50Mi"
  #    cpu: "50m"
  #  limits:
  #    memory: "200Mi"
  #    cpu: "200m"

env:
  # env.topolvm_node -- Specify environment variables for topolvm_node container.
  topolvm_node: []
  # env.csi_registrar -- Specify environment variables for csi_registrar container.
  csi_registrar: []
  # env.liveness_probe -- Specify environment variables for liveness_probe container.
  liveness_probe: []
  # env.topolvm_controller -- Specify environment variables for topolvm_controller container.
  topolvm_controller: []
  # env.csi_provisioner -- Specify environment variables for csi_provisioner container.
  csi_provisioner: []
  # env.csi_resizer -- Specify environment variables for csi_resizer container.
  csi_resizer: []
  # env.csi_snapshotter -- Specify environment variables for csi_snapshotter container.
  csi_snapshotter: []
  # To specify environment variables for lvmd, use lvmd.env instead.
  # lvmd: []
  # env.topolvm_scheduler -- Specify environment variables for topolvm_scheduler container.
  topolvm_scheduler: []

livenessProbe:
  # livenessProbe.topolvm_node -- Specify resources.
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
  topolvm_node:
    failureThreshold:
    initialDelaySeconds: 10
    timeoutSeconds: 3
    periodSeconds: 60
  # livenessProbe.csi_registrar -- Specify livenessProbe.
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
  csi_registrar:
    failureThreshold:
    initialDelaySeconds: 10
    timeoutSeconds: 3
    periodSeconds: 60
  # livenessProbe.topolvm_controller -- Specify livenessProbe.
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
  topolvm_controller:
    failureThreshold:
    initialDelaySeconds: 10
    timeoutSeconds: 3
    periodSeconds: 60
  # livenessProbe.lvmd -- Specify livenessProbe.
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
  lvmd:
    failureThreshold:
    initialDelaySeconds: 10
    timeoutSeconds: 3
    periodSeconds: 60
  # livenessProbe.topolvm_scheduler -- Specify livenessProbe.
  ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
  topolvm_scheduler:
    failureThreshold:
    initialDelaySeconds: 10
    timeoutSeconds: 3
    periodSeconds: 60

# storageClasses -- Whether to create storageclass(es)
# ref: https://kubernetes.io/docs/concepts/storage/storage-classes/
storageClasses:
  - name: topolvm-provisioner  # Defines name of storage class.
    storageClass:
      # Supported filesystems are: ext4, xfs, and btrfs.
      fsType: ext4
      # reclaimPolicy
      reclaimPolicy:  # Delete
      # Additional annotations
      annotations: {}
      # Default storage class for dynamic volume provisioning
      # ref: https://kubernetes.io/docs/concepts/storage/dynamic-provisioning
      isDefaultClass: false
      # volumeBindingMode can be either WaitForFirstConsumer or Immediate. WaitForFirstConsumer is recommended because TopoLVM cannot schedule pods wisely if volumeBindingMode is Immediate.
      volumeBindingMode: WaitForFirstConsumer #Immediate 
      # enables CSI drivers to expand volumes. This feature is available for Kubernetes 1.16 and later releases.
      allowVolumeExpansion: true
      additionalParameters: {}
      topolvm.io/device-class: "localcache-hdd"

webhook:
  existingCertManagerIssuer: {}
  #  group: cert-manager.io/v1
  #  kind: ClusterIssuer
  #  name: letsencrypt-prod
  podMutatingWebhook:
    # webhook.podMutatingWebhook.enabled -- Enable Pod MutatingWebhook.
    enabled: false
  pvcMutatingWebhook:
    # webhook.pvcMutatingWebhook.enabled -- Enable PVC MutatingWebhook.
    enabled: true

# Container Security Context
# ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
securityContext:
  # securityContext.runAsUser -- Specify runAsUser.
  runAsUser: 10000
  # securityContext.runAsGroup -- Specify runAsGroup.
  runAsGroup: 10000

cert-manager:
  # cert-manager.enabled -- Install cert-manager together.
  ## ref: https://cert-manager.io/docs/installation/kubernetes/#installing-with-helm
  enabled: true

priorityClass:
  # priorityClass.enabled -- Install priorityClass.
  enabled: true
  # priorityClass.name -- Specify priorityClass resource name.
  name: topolvm
  # priorityClass.value  -- Specify priorityClass value.
  value: 1000000

snapshot:
  # snapshot.enabled -- Turn on the snapshot feature.
  enabled: true

I unfortunately don't have control over disk provisioning, but I can create logical volumes with thin provisioning using the lvcreate --thin ... command. Therefore, I included it in the config like this:

  deviceClasses:
    - name: localcache-hdd
      volume-group: vg_localcache
      default: true
      spare-gb: 10
      lvcreate-options:   
        - --thin

I re-deployed TopoLVM and started from scratch.

  1. Created PVC
  2. Created a container (PVC couldn't be attached to the pod) :(

Output of kubectl describe for the PVC:

Events:
  Type     Reason               Age                      From                                                                                 Message
  ----     ------               ----                     ----                                                                                 -------
  Warning  ProvisioningFailed   3m27s (x8 over 3h12m)    topolvm.io_topolvm-controller-5b446dc55d-ks9sd_820a7725-8a58-4089-a564-575cf399b91e  failed to provision volume with StorageClass "topolvm-provisioner": rpc error: code = Internal desc = not found
  Normal   WaitForPodScheduled  2m33s (x811 over 3h12m)  persistentvolume-controller                                                          waiting for pod dind to be scheduled
  Warning  ProvisioningFailed   117s (x57 over 3h12m)    topolvm.io_topolvm-controller-5b446dc55d-ks9sd_820a7725-8a58-4089-a564-575cf399b91e  failed to provision volume with StorageClass "topolvm-provisioner": rpc error: code = ResourceExhausted desc = no enough space left on VG: free=10670309376, requested=10737418240
  Normal   Provisioning         106s (x66 over 3h12m)    topolvm.io_topolvm-controller-5b446dc55d-ks9sd_820a7725-8a58-4089-a564-575cf399b91e  External provisioner is provisioning volume for claim "docker/localcache-pvc"

Log from topolvm-lvmd:

{"level":"info","ts":"2024-01-18T12:58:09Z","msg":"invoking LVM command","args":["fullreport","--reportformat","json","--units","b","--nosuffix","--configreport","vg","-o","vg_name,vg_uuid,vg_size,vg_free","--configreport","lv","-o","lv_uuid,lv_name,lv_full_name,lv_path,lv_size,lv_kernel_major,lv_kernel_minor,origin,origin_size,pool_lv,lv_tags,lv_attr,vg_name,data_percent,metadata_percent,pool_lv","--configreport","pv","-o,","--configreport","pvseg","-o,","--configreport","seg","-o,"]}
{"level":"info","ts":"2024-01-18T12:58:09Z","msg":"invoking LVM command","args":["fullreport","--reportformat","json","--units","b","--nosuffix","--configreport","vg","-o","vg_name,vg_uuid,vg_size,vg_free","--configreport","lv","-o","lv_uuid,lv_name,lv_full_name,lv_path,lv_size,lv_kernel_major,lv_kernel_minor,origin,origin_size,pool_lv,lv_tags,lv_attr,vg_name,data_percent,metadata_percent,pool_lv","--configreport","pv","-o,","--configreport","pvseg","-o,","--configreport","seg","-o,"]}
{"level":"info","ts":"2024-01-18T12:58:10Z","msg":"invoking LVM command","args":["lvcreate","-n","381debf8-d1e0-4cf9-acf5-c414d7af82d7","-L","10737418240b","-W","y","-y","--thin","vg_localcache"]}
{"level":"info","ts":"2024-01-18T12:58:10Z","msg":"invoking LVM command","args":["fullreport","--reportformat","json","--units","b","--nosuffix","--configreport","vg","-o","vg_name,vg_uuid,vg_size,vg_free","--configreport","lv","-o","lv_uuid,lv_name,lv_full_name,lv_path,lv_size,lv_kernel_major,lv_kernel_minor,origin,origin_size,pool_lv,lv_tags,lv_attr,vg_name,data_percent,metadata_percent,pool_lv","--configreport","pv","-o,","--configreport","pvseg","-o,","--configreport","seg","-o,"]}
{"level":"error","ts":"2024-01-18T12:58:10Z","msg":"failed to create volume","name":"381debf8-d1e0-4cf9-acf5-c414d7af82d7","requested":10737418240,"tags":[],"error":"not found","stacktrace":"github.com/topolvm/topolvm/lvmd.(*lvService).CreateLV\n\t/workdir/lvmd/lvservice.go:122\ngithub.com/topolvm/topolvm/lvmd/proto._LVService_CreateLV_Handler\n\t/workdir/lvmd/proto/lvmd_grpc.pb.go:127\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.58.3/server.go:1374\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.58.3/server.go:1751\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.58.3/server.go:986"}
{"level":"info","ts":"2024-01-18T12:58:10Z","msg":"invoking LVM command","args":["fullreport","--reportformat","json","--units","b","--nosuffix","--configreport","vg","-o","vg_name,vg_uuid,vg_size,vg_free","--configreport","lv","-o","lv_uuid,lv_name,lv_full_name,lv_path,lv_size,lv_kernel_major,lv_kernel_minor,origin,origin_size,pool_lv,lv_tags,lv_attr,vg_name,data_percent,metadata_percent,pool_lv","--configreport","pv","-o,","--configreport","pvseg","-o,","--configreport","seg","-o,"]}
{"level":"info","ts":"2024-01-18T12:58:18Z","msg":"invoking LVM command","args":["fullreport","--reportformat","json","--units","b","--nosuffix","--configreport","vg","-o","vg_name,vg_uuid,vg_size,vg_free","--configreport","lv","-o","lv_uuid,lv_name,lv_full_name,lv_path,lv_size,lv_kernel_major,lv_kernel_minor,origin,origin_size,pool_lv,lv_tags,lv_attr,vg_name,data_percent,metadata_percent,pool_lv","--configreport","pv","-o,","--configreport","pvseg","-o,","--configreport","seg","-o,"]}
{"level":"info","ts":"2024-01-18T12:58:18Z","msg":"invoking LVM command","args":["fullreport","--reportformat","json","--units","b","--nosuffix","--configreport","vg","-o","vg_name,vg_uuid,vg_size,vg_free","--configreport","lv","-o","lv_uuid,lv_name,lv_full_name,lv_path,lv_size,lv_kernel_major,lv_kernel_minor,origin,origin_size,pool_lv,lv_tags,lv_attr,vg_name,data_percent,metadata_percent,pool_lv","--configreport","pv","-o,","--configreport","pvseg","-o,","--configreport","seg","-o,"]}
{"level":"error","ts":"2024-01-18T12:58:18Z","msg":"not enough space left on VG","name":"e6c6985f-b781-493e-b818-780cb9afec57","free":10670309376,"requested":10737418240,"stacktrace":"github.com/topolvm/topolvm/lvmd.(*lvService).CreateLV\n\t/workdir/lvmd/lvservice.go:90\ngithub.com/topolvm/topolvm/lvmd/proto._LVService_CreateLV_Handler\n\t/workdir/lvmd/proto/lvmd_grpc.pb.go:127\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.58.3/server.go:1374\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.58.3/server.go:1751\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.58.3/server.go:986"}

And when I check lvs on the node, I see:

root@runner-test-wrk-1:~# lvs
  LV                                   VG            Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  03421acb-f56f-4dc2-9ae5-f39debd478ee vg_localcache twi-a-tz-- 10.00g             0.00   10.61
  6b846237-1867-411d-8450-c6123c36ac10 vg_localcache twi-a-tz-- 10.00g             0.00   10.61
  7d46aed5-4df0-4eb1-92a0-55e03d6f6e01 vg_localcache twi-a-tz-- 10.00g             0.00   10.61
  d9070b16-3d09-4073-a2b5-eeb95e038bee vg_localcache twi-a-tz-- 10.00g             0.00   10.61

I can't figure out why TopoLVM is creating multiple logical volumes. What exactly is going wrong here?

Hi Koray! The reason you are seeing this is your lvmd config:

  deviceClasses:
    - name: localcache-hdd
      volume-group: vg_localcache
      default: true
      spare-gb: 10
      lvcreate-options:   
        - --thin

This is not the right way to set up a thin pool. It should look like this instead:

  deviceClasses:
    - name: localcache-hdd
      volume-group: vg_localcache
      default: true
      spare-gb: 10
      type: thin
      thin-pool:
         name: thin-pool-1
         overprovision-ratio: 2  # arbitrary

@daichimukai Maybe it makes sense to validate that --thin is not part of lvcreate-options and return an error instead?

@koraypinarci Hi. I'm not sure why the multiple LVs were created in your settings, but could you try the following steps to create a snapshot?

  1. Before deploying TopoLVM, run the lvcreate --thin ... command yourself. TopoLVM doesn't create a thin pool (ref.), so you need to create one. See also the corresponding code in our e2e test.
  2. Modify your values.yaml to define a device class for the thin pool. You need to specify type: thin and the thin-pool field. See our e2e test.
  3. Define a new StorageClass using the device class you defined in the previous step. Use additionalParameters here (see the sketch after this list). See also here.
  4. Apply a PVC using the StorageClass you defined, bind a Pod to the PVC, and take a snapshot of it.
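As an illustration of step 3 only (a sketch, not taken from this issue; the storage class and device class names are placeholders), the Helm values could gain an entry whose additionalParameters points at the thin device class defined in step 2:

storageClasses:
  - name: topolvm-provisioner-thin        # placeholder name
    storageClass:
      fsType: xfs
      volumeBindingMode: WaitForFirstConsumer
      allowVolumeExpansion: true
      additionalParameters:
        "topolvm.io/device-class": "localcache-hdd-thin"   # thin device class from step 2

The chart renders additionalParameters into the StorageClass parameters, so PVCs using this class should be provisioned from the thin pool.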

@jakobmoellerdev I think lvcreate-options is a kind of escape hatch for users and is expected not to be validated at all, as the documentation says. Instead, we may need more docs for snapshots, like the steps above.

@ushitora-anqou The code in TopoLVM does not work properly for thin pools created via lvcreate-options, because all of the thin-pool checks are based on the device-class properties in lvmd, not on the lvcreate options. I think this is the reason for the misbehaving code path. The test you linked only shows the VG/LV creation, but the lvmd config associated with it shows that we need this different format:

topolvm/e2e/lvmd1.yaml, lines 20 to 23 in f9c44db:

type: thin
thin-pool:
  name: "pool0"
  overprovision-ratio: 5.0

I do agree that this can be used as an "escape hatch" for RAID etc., but not for thin provisioning, since thin provisioning is already part of TopoLVM's API.
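For example (an untested sketch, not from this issue; the device class, volume group, and options are placeholders), a RAID-style use of the escape hatch could look like this, with lvmd appending each list entry to the lvcreate call, as the logs above show for --thin:

deviceClasses:
  - name: raid1-hdd              # placeholder device class
    volume-group: vg_raid        # placeholder VG with at least two PVs
    spare-gb: 10
    lvcreate-options:
      - --type
      - raid1
      - --mirrors
      - "1"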

I agree that lvcreate-options is not intended for thin provisioning and that the lvmd device-class properties should be used for it instead.

However, validating lvcreate-options doesn't seem like the right approach here. My concerns are as follows:

  • The TopoLVM documentation already says that lvcreate-options should be used at users' own risk. I'd like to keep it that way.
  • Validating lvcreate-options in a sound and complete way seems very difficult to me. That is, I want to avoid annoying situations where TopoLVM's lvcreate-options doesn't support some strings that lvcreate does, or vice versa. For example, naive string matching for --thin and -T won't work if a combined command-line argument such as -vT is used. To avoid such situations, we would need to parse the options passed through lvcreate-options in exactly the same way as lvcreate does. That is difficult because lvcreate uses getopt(3) to parse its arguments, which we can't use since we're writing Go (without cgo).

It seems to me that this issue (#827) was raised because TopoLVM currently lacks documentation for thin provisioning and snapshots. So, I'd like to suggest adding a short step-by-step tutorial that will be sufficient for users to start using snapshots.

Hello everyone,

Thank you for your prompt response.

It took me some time to implement and test the configuration you recommended. I'm happy to report that TopoLVM is performing exactly as I envisioned. Thank you once again for your support. I'd like to share my configuration here in case someone else encounters the same issue, to help them find a solution more quickly, so some of it repeats what was said above.

Here's what I've changed in my setup:

  1. As @jakobmoellerdev suggested, I adjusted my deviceClasses:
  deviceClasses:
    - name: localcache-hdd-thin
      volume-group: vg_localcache
      default: true
      spare-gb: 5
      type: thin
      thin-pool:
        name: thin-pool-1
        overprovision-ratio: 2
  2. Then, I created a new thin pool on the node. @ushitora-anqou, thank you for the tip.

List disks:

lsblk

First step: create the physical volume:

pvcreate /dev/sdb

Second step: create the volume group:

vgcreate vg_localcache /dev/sdb

Third step: create the thin pool:

lvcreate -T -n thin-pool-1 -L 45G vg_localcache
  3. I adjusted my StorageClass:
storageClasses:
  - name: topolvm-provisioner  # Defines name of storage class.
    storageClass:
      fsType: ext4
      reclaimPolicy:  # Delete
      annotations: {}
      isDefaultClass: false
      volumeBindingMode: WaitForFirstConsumer
      allowVolumeExpansion: true
      additionalParameters: {}
      topolvm.io/device-class: "localcache-hdd-thin"
  - name: topolvm-provisioner-thin
    storageClass:
      fsType: xfs
      isDefaultClass: true
      volumeBindingMode: WaitForFirstConsumer
      allowVolumeExpansion: true
      additionalParameters:
        '{{ include "topolvm.pluginName" . }}/device-class': "localcache-hdd-thin"
  4. Redeployed TopoLVM with Helm:
#!/bin/bash

set -euo pipefail

TOPOLVM_CONTEXT=ghrunner-test
TOPOLVM_NAMESPACE=topolvm-system
TOPOLVM_INSTALLATION_NAME=topolvm
TOPOLVM_VALUES_FILE=/topolvm/kube-system/values.yaml
KUBECONFIG_PATH=~/.kube/config.yaml
TOPOLVM_CHART_VERSION=13.0.1

# Change context
kubectl config use-context "${TOPOLVM_CONTEXT}"

# Add helm repo for topolvm
helm repo add topolvm https://topolvm.github.io/topolvm \
  && helm repo update


# Create namespaces and label them so the TopoLVM webhook ignores them
kubectl --kubeconfig "${KUBECONFIG_PATH}" create namespace "${TOPOLVM_NAMESPACE}" || true
kubectl --kubeconfig "${KUBECONFIG_PATH}" label namespace "${TOPOLVM_NAMESPACE}" topolvm.io/webhook=ignore --overwrite
kubectl --kubeconfig "${KUBECONFIG_PATH}" label namespace kube-system topolvm.io/webhook=ignore --overwrite


# Install TopoLVM with the release name
helm --kubeconfig "${KUBECONFIG_PATH}" upgrade --install "${TOPOLVM_INSTALLATION_NAME}" \
     --namespace "${TOPOLVM_NAMESPACE}" \
     --create-namespace \
     --values "${TOPOLVM_VALUES_FILE}" \
     --version "${TOPOLVM_CHART_VERSION}" \
     --debug \
     topolvm/topolvm


# Check that TopoLVM is running. All TopoLVM pods should be Running.
kubectl --kubeconfig "${KUBECONFIG_PATH}" get pod --namespace "${TOPOLVM_NAMESPACE}"
  5. This is a simple script that saves me the effort of deploying resources individually:
#!/bin/bash

set -euo pipefail

# Configuration variables
TOPVLM_CONTEXT=ghrunner-test
KUBECONFIG_PATH=~/.kube/config.yaml
CONFIG_DIR=/topolvm/custom
TOPOLVM_NAMESPACE=default

# Function to apply Kubernetes configurations
apply_config() {
    echo "🚀 Applying $1..."
    kubectl --kubeconfig "${KUBECONFIG_PATH}" apply -f "${CONFIG_DIR}/$1"
    sleep 10
}

# Function to delete Kubernetes configurations
delete_config() {
    echo "💀 Deleting $1..."
    kubectl --kubeconfig "${KUBECONFIG_PATH}" delete -f "${CONFIG_DIR}/$1"
    sleep 10
}

# Create PVC
apply_config "pvc.yaml"

# Create initial Pod
apply_config "dind-init-pod.yaml"

# Pull Docker images
echo "🐋 Pulling images"
kubectl --kubeconfig "${KUBECONFIG_PATH}" exec -n "${TOPOLVM_NAMESPACE}" dind -- docker pull ubuntu
sleep 10

# Create SnapshotClass
apply_config "snapshotclass.yaml"

# Remove initial Pod
delete_config "dind-init-pod.yaml"

# Create Snapshot
apply_config "snapshot.yaml"

# Remove original PVC
delete_config "pvc.yaml"

# Start second Pod and bind PVC
apply_config "dind-2-pod.yaml"

# Start third Pod
apply_config "dind-3-pod.yaml"

With this, I was able to create a snapshot from a PVC and use the snapshot to create new PVCs and bind them to my two pods.
This was exactly what I wanted to do. Thank you very much.

I'm glad to hear that! I've opened another PR, #833, to track an additional document for Snapshot & Restore, so I'll close this issue.