TopoLVM Snapshot Issue: Multiple Logical Volumes Created
koraypinarci opened this issue
First of all, thank you for this fantastic project; it's exactly what I've been looking for. I've tested it on my test system with 1 Master and 1 Worker in Kubernetes.
Environments
Helm Version: v3.12.1
Client Version: v1.27.3
Kustomize Version: v5.0.1
Server Version: v1.27.6
I'd like to use TopoLVM as a cache for the Actions-Runner-Controller, for Docker image caching. Therefore, snapshot functionality is essential to me. My goal is to create a PVC, start a pod, bind it to the PVC, pull the necessary images, remove the pod, and create a snapshot of the PVC. Then, I want the Actions-Runner-Controller to dynamically create TopoLVM volumes from the snapshot.
I've created a volume group on the node, deployed TopoLVM with Helm, created a PVC, created a pod, mounted the PVC to the pod, pulled images, and received the following message when creating the snapshot: "Failed to check and update snapshot content: failed to take a snapshot of the volume 25e63f2f-7fce-4fe9-bad1-f8d074ab574c: rpc error: code = Unimplemented desc = device class is not thin. Thick snapshots are not implemented yet."
My initial config for TopoLVM looked like this:
# useLegacy -- If true, the legacy plugin name and legacy custom resource group is used(topolvm.cybozu.com).
useLegacy: false
image:
# image.repository -- TopoLVM image repository to use.
repository: ghcr.io/topolvm/topolvm-with-sidecar
# image.tag -- TopoLVM image tag to use.
# @default -- `{{ .Chart.AppVersion }}`
tag: #13.0.1
# image.pullPolicy -- TopoLVM image pullPolicy.
pullPolicy: # Always
# image.pullSecrets -- List of imagePullSecrets.
pullSecrets: []
csi:
# image.csi.nodeDriverRegistrar -- Specify csi-node-driver-registrar: image.
# If not specified, `ghcr.io/topolvm/topolvm-with-sidecar:{{ .Values.image.tag }}` will be used.
nodeDriverRegistrar: # registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.2.0
# image.csi.csiProvisioner -- Specify csi-provisioner image.
# If not specified, `ghcr.io/topolvm/topolvm-with-sidecar:{{ .Values.image.tag }}` will be used.
csiProvisioner: # registry.k8s.io/sig-storage/csi-provisioner:v2.2.1
# image.csi.csiResizer -- Specify csi-resizer image.
# If not specified, `ghcr.io/topolvm/topolvm-with-sidecar:{{ .Values.image.tag }}` will be used.
csiResizer: # registry.k8s.io/sig-storage/csi-resizer:v1.2.0
# image.csi.csiSnapshotter -- Specify csi-snapshot image.
# If not specified, `ghcr.io/topolvm/topolvm-with-sidecar:{{ .Values.image.tag }}` will be used.
csiSnapshotter: # registry.k8s.io/sig-storage/csi-snapshotter:v5.0.1
# image.csi.livenessProbe -- Specify livenessprobe image.
# If not specified, `ghcr.io/topolvm/topolvm-with-sidecar:{{ .Values.image.tag }}` will be used.
livenessProbe: # registry.k8s.io/sig-storage/livenessprobe:v2.3.0
# A scheduler extender for TopoLVM
scheduler:
# scheduler.enabled -- If true, enable scheduler extender for TopoLVM
enabled: false
# scheduler.args -- Arguments to be passed to the command.
args: []
# scheduler.type -- If you run with a managed control plane (such as GKE, AKS, etc), topolvm-scheduler should be deployed as Deployment and Service.
# topolvm-scheduler should otherwise be deployed as DaemonSet in unmanaged (i.e. bare metal) deployments.
# possible values: daemonset/deployment
type: daemonset
# Use only if you choose `scheduler.type` deployment
deployment:
# scheduler.deployment.replicaCount -- Number of replicas for Deployment.
replicaCount: 1
# Use only if you choose `scheduler.type` deployment
service:
# scheduler.service.type -- Specify Service type.
type: LoadBalancer
# scheduler.service.clusterIP -- Specify Service clusterIP.
clusterIP: # None
# scheduler.service.nodePort -- (int) Specify nodePort.
nodePort: # 30251
# scheduler.updateStrategy -- Specify updateStrategy on the Deployment or DaemonSet.
updateStrategy: {}
# rollingUpdate:
# maxUnavailable: 1
# type: RollingUpdate
# scheduler.terminationGracePeriodSeconds -- (int) Specify terminationGracePeriodSeconds on the Deployment or DaemonSet.
terminationGracePeriodSeconds: # 30
# scheduler.minReadySeconds -- (int) Specify minReadySeconds on the Deployment or DaemonSet.
minReadySeconds: # 0
# scheduler.affinity -- Specify affinity on the Deployment or DaemonSet.
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: node-role.kubernetes.io/control-plane
operator: Exists
podDisruptionBudget:
# scheduler.podDisruptionBudget.enabled -- Specify podDisruptionBudget enabled.
enabled: true
# scheduler.tolerations -- Specify tolerations on the Deployment or DaemonSet.
## ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
tolerations:
- key: CriticalAddonsOnly
operator: Exists
- key: node-role.kubernetes.io/control-plane
effect: NoSchedule
# node-role.kubernetes.io/master taint will not be used in k8s 1.25+.
# cf. https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/kubeadm/2067-rename-master-label-taint/README.md
# TODO: remove this when minimum supported version becomes 1.25.
- key: node-role.kubernetes.io/master
effect: NoSchedule
# scheduler.nodeSelector -- Specify nodeSelector on the Deployment or DaemonSet.
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
nodeSelector: {}
# scheduler.priorityClassName -- Specify priorityClassName on the Deployment or DaemonSet.
priorityClassName:
# scheduler.schedulerOptions -- Tune the Node scoring.
# ref: https://github.com/topolvm/topolvm/blob/master/deploy/README.md
schedulerOptions: {}
# default-divisor: 1
# divisors:
# ssd: 1
# hdd: 10
options:
listen:
# scheduler.options.listen.host -- Host used by Probe.
host: localhost
# scheduler.options.listen.port -- Listen port.
port: 9251
# scheduler.podLabels -- Additional labels to be set on the scheduler pods.
podLabels: {}
# scheduler.labels -- Additional labels to be added to the Deployment or Daemonset.
labels: {}
# lvmd service
lvmd:
# lvmd.managed -- If true, set up lvmd service with DaemonSet.
managed: true
# lvmd.socketName -- Specify socketName.
socketName: /run/topolvm/lvmd.sock
# lvmd.deviceClasses -- Specify the device-class settings.
deviceClasses:
- name: localcache-hdd
volume-group: vg_localcache
default: true
spare-gb: 10
# lvmd.lvcreateOptionClasses -- Specify the lvcreate-option-class settings.
lvcreateOptionClasses: []
#- name: localcache-hdd
# options:
# - --thin
# lvmd.args -- Arguments to be passed to the command.
args: []
# lvmd.priorityClassName -- Specify priorityClassName.
priorityClassName:
# lvmd.tolerations -- Specify tolerations.
## ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
tolerations:
- key: CriticalAddonsOnly
operator: Exists
- key: node-role.kubernetes.io/control-plane
effect: NoSchedule
# lvmd.nodeSelector -- Specify nodeSelector.
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
nodeSelector: {}
# lvmd.affinity -- Specify affinity.
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
affinity: {}
# lvmd.volumes -- Specify volumes.
volumes: []
# - name: lvmd-socket-dir
# hostPath:
# path: /run/topolvm
# type: DirectoryOrCreate
# lvmd.volumeMounts -- Specify volumeMounts.
volumeMounts: []
# - name: lvmd-socket-dir
# mountPath: /run/topolvm
# lvmd.env -- extra environment variables
env: []
# - name: LVM_SYSTEM_DIR
# value: /tmp
# lvmd.additionalConfigs -- Define additional LVM Daemon configs if you have additional types of nodes.
# Please ensure nodeSelectors are non overlapping.
additionalConfigs: []
# - tolerations: []
# nodeSelector: {}
# device-classes:
# - name: ssd
# volume-group: myvg2
# default: true
# spare-gb: 10
# lvmd.updateStrategy -- Specify updateStrategy.
updateStrategy: {}
# type: RollingUpdate
# rollingUpdate:
# maxSurge: 50%
# maxUnavailable: 50%
# lvmd.podLabels -- Additional labels to be set on the lvmd service pods.
podLabels: {}
# lvmd.labels -- Additional labels to be added to the Daemonset.
labels: {}
# lvmd.initContainers -- Additional initContainers for the lvmd service.
initContainers: []
# CSI node service
node:
# node.lvmdEmbedded -- Specify whether to embed lvmd in the node container.
# Should not be used in conjunction with lvmd.managed otherwise lvmd will be started twice.
lvmdEmbedded: false
# node.lvmdSocket -- Specify the socket to be used for communication with lvmd.
lvmdSocket: /run/topolvm/lvmd.sock
# node.kubeletWorkDirectory -- Specify the work directory of Kubelet on the host.
# For example, on microk8s it needs to be set to `/var/snap/microk8s/common/var/lib/kubelet`
kubeletWorkDirectory: /var/lib/kubelet
# node.args -- Arguments to be passed to the command.
args: []
# node.securityContext. -- Container securityContext.
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
securityContext:
privileged: true
metrics:
# node.metrics.enabled -- If true, enable scraping of metrics by Prometheus.
enabled: true
# node.metrics.annotations -- Annotations for Scrape used by Prometheus.
annotations:
prometheus.io/port: metrics
prometheus:
podMonitor:
# node.prometheus.podMonitor.enabled -- Set this to `true` to create PodMonitor for Prometheus operator.
enabled: false
# node.prometheus.podMonitor.additionalLabels -- Additional labels that can be used so PodMonitor will be discovered by Prometheus.
additionalLabels: {}
# node.prometheus.podMonitor.namespace -- Optional namespace in which to create PodMonitor.
namespace: ""
# node.prometheus.podMonitor.interval -- Scrape interval. If not set, the Prometheus default scrape interval is used.
interval: ""
# node.prometheus.podMonitor.scrapeTimeout -- Scrape timeout. If not set, the Prometheus default scrape timeout is used.
scrapeTimeout: ""
# node.prometheus.podMonitor.relabelings -- RelabelConfigs to apply to samples before scraping.
relabelings: []
# - sourceLabels: [__meta_kubernetes_service_label_cluster]
# targetLabel: cluster
# regex: (.*)
# replacement: ${1}
# action: replace
# node.prometheus.podMonitor.metricRelabelings -- MetricRelabelConfigs to apply to samples before ingestion.
metricRelabelings: []
# - sourceLabels: [__meta_kubernetes_service_label_cluster]
# targetLabel: cluster
# regex: (.*)
# replacement: ${1}
# action: replace
# node.priorityClassName -- Specify priorityClassName.
priorityClassName:
# node.tolerations -- Specify tolerations.
## ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
tolerations:
- key: CriticalAddonsOnly
operator: Exists
- key: node-role.kubernetes.io/control-plane
effect: NoSchedule
# node.nodeSelector -- Specify nodeSelector.
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
nodeSelector: {}
# node.affinity -- Specify affinity.
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
affinity: {}
# node.volumes -- Specify volumes.
volumes: []
# - name: registration-dir
# hostPath:
# path: /var/lib/kubelet/plugins_registry/
# type: Directory
# - name: node-plugin-dir
# hostPath:
# path: /var/lib/kubelet/plugins/topolvm.io/node
# type: DirectoryOrCreate
# - name: csi-plugin-dir
# hostPath:
# path: /var/lib/kubelet/plugins/kubernetes.io/csi
# type: DirectoryOrCreate
# - name: pod-volumes-dir
# hostPath:
# path: /var/lib/kubelet/pods/
# type: DirectoryOrCreate
# - name: lvmd-socket-dir
# hostPath:
# path: /run/topolvm
# type: Directory
volumeMounts:
# node.volumeMounts.topolvmNode -- Specify volumes.
topolvmNode: []
# - name: node-plugin-dir
# mountPath: /var/lib/kubelet/plugins/topolvm.io/node
# - name: csi-plugin-dir
# mountPath: /var/lib/kubelet/plugins/kubernetes.io/csi
# mountPropagation: "Bidirectional"
# - name: pod-volumes-dir
# mountPath: /var/lib/kubelet/pods
# mountPropagation: "Bidirectional"
# - name: lvmd-socket-dir
# mountPath: /run/topolvm
# node.updateStrategy -- Specify updateStrategy.
updateStrategy: {}
# type: RollingUpdate
# rollingUpdate:
# maxSurge: 50%
# maxUnavailable: 50%
# node.podLabels -- Additional labels to be set on the node pods.
podLabels: {}
# node.labels -- Additional labels to be added to the Daemonset.
labels: {}
# node.initContainers -- Additional initContainers for the node service.
initContainers: []
# CSI controller service
controller:
# controller.replicaCount -- Number of replicas for CSI controller service.
replicaCount: 1
# controller.args -- Arguments to be passed to the command.
args: []
storageCapacityTracking:
# controller.storageCapacityTracking.enabled -- Enable Storage Capacity Tracking for csi-provisioner.
enabled: true
securityContext:
# controller.securityContext.enabled -- Enable securityContext.
enabled: true
nodeFinalize:
# controller.nodeFinalize.skipped -- Skip automatic cleanup of PhysicalVolumeClaims when a Node is deleted.
skipped: false
prometheus:
podMonitor:
# controller.prometheus.podMonitor.enabled -- Set this to `true` to create PodMonitor for Prometheus operator.
enabled: false
# controller.prometheus.podMonitor.additionalLabels -- Additional labels that can be used so PodMonitor will be discovered by Prometheus.
additionalLabels: {}
# controller.prometheus.podMonitor.namespace -- Optional namespace in which to create PodMonitor.
namespace: ""
# controller.prometheus.podMonitor.interval -- Scrape interval. If not set, the Prometheus default scrape interval is used.
interval: ""
# controller.prometheus.podMonitor.scrapeTimeout -- Scrape timeout. If not set, the Prometheus default scrape timeout is used.
scrapeTimeout: ""
# controller.prometheus.podMonitor.relabelings -- RelabelConfigs to apply to samples before scraping.
relabelings: []
# - sourceLabels: [__meta_kubernetes_service_label_cluster]
# targetLabel: cluster
# regex: (.*)
# replacement: ${1}
# action: replace
# controller.prometheus.podMonitor.metricRelabelings -- MetricRelabelConfigs to apply to samples before ingestion.
metricRelabelings: []
# - sourceLabels: [__meta_kubernetes_service_label_cluster]
# targetLabel: cluster
# regex: (.*)
# replacement: ${1}
# action: replace
# controller.terminationGracePeriodSeconds -- (int) Specify terminationGracePeriodSeconds.
terminationGracePeriodSeconds: # 10
# controller.priorityClassName -- Specify priorityClassName.
priorityClassName:
# controller.updateStrategy -- Specify updateStrategy.
updateStrategy: {}
# type: RollingUpdate
# rollingUpdate:
# maxSurge: 50%
# maxUnavailable: 50%
# controller.minReadySeconds -- (int) Specify minReadySeconds.
minReadySeconds: # 0
# controller.affinity -- Specify affinity.
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
affinity: |
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app.kubernetes.io/component
operator: In
values:
- controller
- key: app.kubernetes.io/name
operator: In
values:
- {{ include "topolvm.name" . }}
topologyKey: kubernetes.io/hostname
# controller.tolerations -- Specify tolerations.
## ref: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
tolerations:
- key: CriticalAddonsOnly
operator: Exists
- key: node-role.kubernetes.io/control-plane
effect: NoSchedule
# controller.nodeSelector -- Specify nodeSelector.
## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
nodeSelector: {}
# controller.volumes -- Specify volumes.
volumes:
- name: socket-dir
emptyDir: {}
podDisruptionBudget:
# controller.podDisruptionBudget.enabled -- Specify podDisruptionBudget enabled.
enabled: true
# controller.podLabels -- Additional labels to be set on the controller pod.
podLabels: {}
# controller.labels -- Additional labels to be added to the Deployment.
labels: {}
# controller.initContainers -- Additional initContainers for the controller service.
initContainers: []
resources:
# resources.topolvm_node -- Specify resources.
## ref: https://kubernetes.io/docs/user-guide/compute-resources/
topolvm_node: {}
# requests:
# memory: 100Mi
# cpu: 100m
# limits:
# memory: 500Mi
# cpu: 500m
# resources.csi_registrar -- Specify resources.
## ref: https://kubernetes.io/docs/user-guide/compute-resources/
csi_registrar: {}
# requests:
# cpu: "25m"
# memory: "10Mi"
# limits:
# cpu: "200m"
# memory: "200Mi"
# resources.liveness_probe -- Specify resources.
## ref: https://kubernetes.io/docs/user-guide/compute-resources/
liveness_probe: {}
# requests:
# cpu: "25m"
# memory: "10Mi"
# limits:
# cpu: "200m"
# memory: "200Mi"
# resources.topolvm_controller -- Specify resources.
## ref: https://kubernetes.io/docs/user-guide/compute-resources/
topolvm_controller: {}
# requests:
# memory: "50Mi"
# cpu: "50m"
# limits:
# memory: "200Mi"
# cpu: "200m"
# resources.csi_provisioner -- Specify resources.
## ref: https://kubernetes.io/docs/user-guide/compute-resources/
csi_provisioner: {}
# requests:
# memory: "50Mi"
# cpu: "50m"
# limits:
# memory: "200Mi"
# cpu: "200m"
# resources.csi_resizer -- Specify resources.
## ref: https://kubernetes.io/docs/user-guide/compute-resources/
csi_resizer: {}
# requests:
# memory: "50Mi"
# cpu: "50m"
# limits:
# memory: "200Mi"
# cpu: "200m"
# resources.csi_snapshotter -- Specify resources.
## ref: https://kubernetes.io/docs/user-guide/compute-resources/
csi_snapshotter: {}
# requests:
# memory: "50Mi"
# cpu: "50m"
# limits:
# memory: "200Mi"
# cpu: "200m"
# resources.lvmd -- Specify resources.
## ref: https://kubernetes.io/docs/user-guide/compute-resources/
lvmd: {}
# requests:
# memory: 100Mi
# cpu: 100m
# limits:
# memory: 500Mi
# cpu: 500m
# resources.topolvm_scheduler -- Specify resources.
## ref: https://kubernetes.io/docs/user-guide/compute-resources/
topolvm_scheduler: {}
# requests:
# memory: "50Mi"
# cpu: "50m"
# limits:
# memory: "200Mi"
# cpu: "200m"
env:
# env.topolvm_node -- Specify environment variables for topolvm_node container.
topolvm_node: []
# env.csi_registrar -- Specify environment variables for csi_registrar container.
csi_registrar: []
# env.liveness_probe -- Specify environment variables for liveness_probe container.
liveness_probe: []
# env.topolvm_controller -- Specify environment variables for topolvm_controller container.
topolvm_controller: []
# env.csi_provisioner -- Specify environment variables for csi_provisioner container.
csi_provisioner: []
# env.csi_resizer -- Specify environment variables for csi_resizer container.
csi_resizer: []
# env.csi_snapshotter -- Specify environment variables for csi_snapshotter container.
csi_snapshotter: []
# To specify environment variables for lvmd, use lvmd.env instead.
# lvmd: []
# env.topolvm_scheduler -- Specify environment variables for topolvm_scheduler container.
topolvm_scheduler: []
livenessProbe:
# livenessProbe.topolvm_node -- Specify resources.
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
topolvm_node:
failureThreshold:
initialDelaySeconds: 10
timeoutSeconds: 3
periodSeconds: 60
# livenessProbe.csi_registrar -- Specify livenessProbe.
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
csi_registrar:
failureThreshold:
initialDelaySeconds: 10
timeoutSeconds: 3
periodSeconds: 60
# livenessProbe.topolvm_controller -- Specify livenessProbe.
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
topolvm_controller:
failureThreshold:
initialDelaySeconds: 10
timeoutSeconds: 3
periodSeconds: 60
# livenessProbe.lvmd -- Specify livenessProbe.
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
lvmd:
failureThreshold:
initialDelaySeconds: 10
timeoutSeconds: 3
periodSeconds: 60
# livenessProbe.topolvm_scheduler -- Specify livenessProbe.
## ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
topolvm_scheduler:
failureThreshold:
initialDelaySeconds: 10
timeoutSeconds: 3
periodSeconds: 60
# storageClasses -- Whether to create storageclass(es)
# ref: https://kubernetes.io/docs/concepts/storage/storage-classes/
storageClasses:
- name: topolvm-provisioner # Defines name of storage class.
storageClass:
# Supported filesystems are: ext4, xfs, and btrfs.
fsType: ext4
# reclaimPolicy
reclaimPolicy: # Delete
# Additional annotations
annotations: {}
# Default storage class for dynamic volume provisioning
# ref: https://kubernetes.io/docs/concepts/storage/dynamic-provisioning
isDefaultClass: false
# volumeBindingMode can be either WaitForFirstConsumer or Immediate. WaitForFirstConsumer is recommended because TopoLVM cannot schedule pods wisely if volumeBindingMode is Immediate.
volumeBindingMode: WaitForFirstConsumer #Immediate
# enables CSI drivers to expand volumes. This feature is available for Kubernetes 1.16 and later releases.
allowVolumeExpansion: true
additionalParameters: {}
topolvm.io/device-class: "localcache-hdd"
webhook:
existingCertManagerIssuer: {}
# group: cert-manager.io/v1
# kind: ClusterIssuer
# name: letsencrypt-prod
podMutatingWebhook:
# webhook.podMutatingWebhook.enabled -- Enable Pod MutatingWebhook.
enabled: false
pvcMutatingWebhook:
# webhook.pvcMutatingWebhook.enabled -- Enable PVC MutatingWebhook.
enabled: true
# Container Security Context
# ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
securityContext:
# securityContext.runAsUser -- Specify runAsUser.
runAsUser: 10000
# securityContext.runAsGroup -- Specify runAsGroup.
runAsGroup: 10000
cert-manager:
# cert-manager.enabled -- Install cert-manager together.
## ref: https://cert-manager.io/docs/installation/kubernetes/#installing-with-helm
enabled: true
priorityClass:
# priorityClass.enabled -- Install priorityClass.
enabled: true
# priorityClass.name -- Specify priorityClass resource name.
name: topolvm
# priorityClass.value -- Specify priorityClass value.
value: 1000000
snapshot:
# snapshot.enabled -- Turn on the snapshot feature.
enabled: true
I unfortunately don't have control over disk provisioning, but I can create logical volumes with thin provisioning using the lvcreate --thin ... command. Therefore, I included it in the config like this:
deviceClasses:
  - name: localcache-hdd
    volume-group: vg_localcache
    default: true
    spare-gb: 10
    lvcreate-options:
      - --thin
I re-deployed TopoLVM and started from scratch.
- Created PVC
- Created a container (PVC couldn't be attached to the pod) :(
Description of PVC:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning ProvisioningFailed 3m27s (x8 over 3h12m) topolvm.io_topolvm-controller-5b446dc55d-ks9sd_820a7725-8a58-4089-a564-575cf399b91e failed to provision volume with StorageClass "topolvm-provisioner": rpc error: code = Internal desc = not found
Normal WaitForPodScheduled 2m33s (x811 over 3h12m) persistentvolume-controller waiting for pod dind to be scheduled
Warning ProvisioningFailed 117s (x57 over 3h12m) topolvm.io_topolvm-controller-5b446dc55d-ks9sd_820a7725-8a58-4089-a564-575cf399b91e failed to provision volume with StorageClass "topolvm-provisioner": rpc error: code = ResourceExhausted desc = no enough space left on VG: free=10670309376, requested=10737418240
Normal Provisioning 106s (x66 over 3h12m) topolvm.io_topolvm-controller-5b446dc55d-ks9sd_820a7725-8a58-4089-a564-575cf399b91e External provisioner is provisioning volume for claim "docker/localcache-pvc"
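As a side note on the ResourceExhausted event, the gap between the two byte counts in the message is small. This is just arithmetic on the numbers quoted above, not TopoLVM output:

```shell
# Values copied from the ProvisioningFailed event above.
requested=10737418240   # the PVC asks for exactly 10 GiB
free=10670309376        # free space reported for vg_localcache
echo $(( requested - free ))                  # shortfall in bytes
echo $(( (requested - free) / 1024 / 1024 ))  # shortfall in MiB
```

The request misses by only 64 MiB, consistent with the earlier retries having already consumed most of the VG.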
Log from topolvm-lvmd:
{"level":"info","ts":"2024-01-18T12:58:09Z","msg":"invoking LVM command","args":["fullreport","--reportformat","json","--units","b","--nosuffix","--configreport","vg","-o","vg_name,vg_uuid,vg_size,vg_free","--configreport","lv","-o","lv_uuid,lv_name,lv_full_name,lv_path,lv_size,lv_kernel_major,lv_kernel_minor,origin,origin_size,pool_lv,lv_tags,lv_attr,vg_name,data_percent,metadata_percent,pool_lv","--configreport","pv","-o,","--configreport","pvseg","-o,","--configreport","seg","-o,"]}
{"level":"info","ts":"2024-01-18T12:58:09Z","msg":"invoking LVM command","args":["fullreport","--reportformat","json","--units","b","--nosuffix","--configreport","vg","-o","vg_name,vg_uuid,vg_size,vg_free","--configreport","lv","-o","lv_uuid,lv_name,lv_full_name,lv_path,lv_size,lv_kernel_major,lv_kernel_minor,origin,origin_size,pool_lv,lv_tags,lv_attr,vg_name,data_percent,metadata_percent,pool_lv","--configreport","pv","-o,","--configreport","pvseg","-o,","--configreport","seg","-o,"]}
{"level":"info","ts":"2024-01-18T12:58:10Z","msg":"invoking LVM command","args":["lvcreate","-n","381debf8-d1e0-4cf9-acf5-c414d7af82d7","-L","10737418240b","-W","y","-y","--thin","vg_localcache"]}
{"level":"info","ts":"2024-01-18T12:58:10Z","msg":"invoking LVM command","args":["fullreport","--reportformat","json","--units","b","--nosuffix","--configreport","vg","-o","vg_name,vg_uuid,vg_size,vg_free","--configreport","lv","-o","lv_uuid,lv_name,lv_full_name,lv_path,lv_size,lv_kernel_major,lv_kernel_minor,origin,origin_size,pool_lv,lv_tags,lv_attr,vg_name,data_percent,metadata_percent,pool_lv","--configreport","pv","-o,","--configreport","pvseg","-o,","--configreport","seg","-o,"]}
{"level":"error","ts":"2024-01-18T12:58:10Z","msg":"failed to create volume","name":"381debf8-d1e0-4cf9-acf5-c414d7af82d7","requested":10737418240,"tags":[],"error":"not found","stacktrace":"github.com/topolvm/topolvm/lvmd.(*lvService).CreateLV\n\t/workdir/lvmd/lvservice.go:122\ngithub.com/topolvm/topolvm/lvmd/proto._LVService_CreateLV_Handler\n\t/workdir/lvmd/proto/lvmd_grpc.pb.go:127\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.58.3/server.go:1374\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.58.3/server.go:1751\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.58.3/server.go:986"}
{"level":"info","ts":"2024-01-18T12:58:10Z","msg":"invoking LVM command","args":["fullreport","--reportformat","json","--units","b","--nosuffix","--configreport","vg","-o","vg_name,vg_uuid,vg_size,vg_free","--configreport","lv","-o","lv_uuid,lv_name,lv_full_name,lv_path,lv_size,lv_kernel_major,lv_kernel_minor,origin,origin_size,pool_lv,lv_tags,lv_attr,vg_name,data_percent,metadata_percent,pool_lv","--configreport","pv","-o,","--configreport","pvseg","-o,","--configreport","seg","-o,"]}
{"level":"info","ts":"2024-01-18T12:58:18Z","msg":"invoking LVM command","args":["fullreport","--reportformat","json","--units","b","--nosuffix","--configreport","vg","-o","vg_name,vg_uuid,vg_size,vg_free","--configreport","lv","-o","lv_uuid,lv_name,lv_full_name,lv_path,lv_size,lv_kernel_major,lv_kernel_minor,origin,origin_size,pool_lv,lv_tags,lv_attr,vg_name,data_percent,metadata_percent,pool_lv","--configreport","pv","-o,","--configreport","pvseg","-o,","--configreport","seg","-o,"]}
{"level":"info","ts":"2024-01-18T12:58:18Z","msg":"invoking LVM command","args":["fullreport","--reportformat","json","--units","b","--nosuffix","--configreport","vg","-o","vg_name,vg_uuid,vg_size,vg_free","--configreport","lv","-o","lv_uuid,lv_name,lv_full_name,lv_path,lv_size,lv_kernel_major,lv_kernel_minor,origin,origin_size,pool_lv,lv_tags,lv_attr,vg_name,data_percent,metadata_percent,pool_lv","--configreport","pv","-o,","--configreport","pvseg","-o,","--configreport","seg","-o,"]}
{"level":"error","ts":"2024-01-18T12:58:18Z","msg":"not enough space left on VG","name":"e6c6985f-b781-493e-b818-780cb9afec57","free":10670309376,"requested":10737418240,"stacktrace":"github.com/topolvm/topolvm/lvmd.(*lvService).CreateLV\n\t/workdir/lvmd/lvservice.go:90\ngithub.com/topolvm/topolvm/lvmd/proto._LVService_CreateLV_Handler\n\t/workdir/lvmd/proto/lvmd_grpc.pb.go:127\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.58.3/server.go:1374\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.58.3/server.go:1751\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.1\n\t/go/pkg/mod/google.golang.org/grpc@v1.58.3/server.go:986"}
And when I check lvs on the node, I see:
root@runner-test-wrk-1:~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
03421acb-f56f-4dc2-9ae5-f39debd478ee vg_localcache twi-a-tz-- 10.00g 0.00 10.61
6b846237-1867-411d-8450-c6123c36ac10 vg_localcache twi-a-tz-- 10.00g 0.00 10.61
7d46aed5-4df0-4eb1-92a0-55e03d6f6e01 vg_localcache twi-a-tz-- 10.00g 0.00 10.61
d9070b16-3d09-4073-a2b5-eeb95e038bee vg_localcache twi-a-tz-- 10.00g 0.00 10.61
I can't figure out why TopoLVM is creating multiple logical volumes. What exactly is going wrong here?
Hi Koray! The reason you are seeing this is your lvmd config:
deviceClasses:
  - name: localcache-hdd
    volume-group: vg_localcache
    default: true
    spare-gb: 10
    lvcreate-options:
      - --thin
This is not the right way to set up a thin pool. It should look like this instead:
deviceClasses:
  - name: localcache-hdd
    volume-group: vg_localcache
    default: true
    spare-gb: 10
    type: thin
    thin-pool:
      name: thin-pool-1
      overprovision-ratio: 2 # arbitrary
@daichimukai maybe it makes sense to validate that `--thin` is not part of `lvcreate-options` and throw an error instead?
@koraypinarci Hi. I'm not sure why the multiple LVs were created in your settings, but could you try the following steps to create a snapshot?
- Before deploying TopoLVM, run the `lvcreate --thin ...` command on your own. TopoLVM doesn't create a thin pool (ref.), so you need to create one. See also the corresponding code in our e2e test.
- Modify your values.yaml to define a device class for the thin pool. You need to specify `type: thin` and the `thin-pool` field. See our e2e test.
- Define a new StorageClass using the device class you defined in the previous step. Use `additionalParameters` here. See also here.
- Apply a PVC using the StorageClass you defined, bind a Pod to the PVC, and take a snapshot of it.
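For the last step, the snapshot objects look roughly like this — a minimal sketch assuming the external-snapshotter CRDs are installed; the object names and namespace are illustrative:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: topolvm-snapclass        # illustrative name
driver: topolvm.io               # the TopoLVM CSI driver name
deletionPolicy: Delete
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: localcache-snap          # illustrative name
  namespace: docker
spec:
  volumeSnapshotClassName: topolvm-snapclass
  source:
    persistentVolumeClaimName: localcache-pvc   # the PVC from the events above
```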
@jakobmoellerdev I think `lvcreate-options` is a kind of escape hatch for users and is expected not to be validated at all, as the document says. Instead, we may need more docs for snapshots like the one above.
@ushitora-anqou the code in TopoLVM does not work properly for thin pools set up via raw lvcreate options, because all of the thin-pool checks are based on the property in lvmd, not on the lvcreate option. I think this is the reason for the misbehaving code path. The test that you linked only shows the vg/lv creation, but the lvmd config associated with it shows that we need this different format:
Lines 20 to 23 in f9c44db
I do agree that this can be used as an "escape hatch" for RAID etc., but not for thin provisioning, since that is already part of TopoLVM's API.
I agree that `lvcreate-options` is not intended for thin provisioning, and the properties of lvmd should be edited to use it.
However, the idea of validating `lvcreate-options` doesn't seem to be an appropriate option here. My concerns are as follows:
- The TopoLVM documentation already says that `lvcreate-options` should be used at users' own risk. I'd like to keep it that way.
- Validating `lvcreate-options` in a sound and complete way seems very difficult to me. That is, I want to avoid annoying situations where TopoLVM's `lvcreate-options` doesn't support some strings that `lvcreate` does, or vice versa. For example, naive string matching for `--thin` and `-T` won't work if a combined command-line argument such as `-vT` is used. To avoid such situations, we would need to parse the options passed through `lvcreate-options` in exactly the same way as `lvcreate` does. But this is difficult because `lvcreate` uses getopt(3) to parse its arguments, and we can't use that since we're writing Go (without cgo).
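The combined-short-option case is easy to demonstrate. Here is a hypothetical validator using naive string matching (not TopoLVM code) that misses it:

```shell
# check_thin is a hypothetical validator that looks for an exact
# "--thin" or "-T" argument -- the naive approach discussed above.
check_thin() {
  for arg in "$@"; do
    case "$arg" in
      --thin|-T) echo "thin flag found"; return 0 ;;
    esac
  done
  echo "no thin flag found"
}

check_thin --thin -L 10G   # prints "thin flag found"
check_thin -vL 10G -T      # prints "thin flag found"
check_thin -vT -L 10G      # prints "no thin flag found", even though
                           # lvcreate itself treats -vT as verbose + thin
```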
It seems to me that this issue (#827) was raised because TopoLVM currently lacks documentation for thin provisioning and snapshots. So, I'd like to suggest adding a short step-by-step tutorial that will be sufficient for users to start using snapshots.
Hello everyone,
Thank you for your prompt response.
It took me some time to implement and test the configuration you recommended. I'm happy to report that TopoLVM is performing exactly as I envisioned. Thank you once again for your support. I'd like to share my configuration here in case someone else encounters the same issue, to help them find a solution more quickly. Therefore, some aspects might be repetitive.
Here's what I've changed in my setup:
- As @jakobmoellerdev suggested, I adjusted my deviceClasses:
```yaml
deviceClasses:
  - name: localcache-hdd-thin
    volume-group: vg_localcache
    default: true
    spare-gb: 5
    type: thin
    thin-pool:
      name: thin-pool-1
      overprovision-ratio: 2
```
- Then, I created a new thin pool on the node. @ushitora-anqou, thank you for the tip.
```shell
# List disks
lsblk

# First step: create the physical volume
pvcreate /dev/sdb

# Second step: create the volume group
vgcreate vg_localcache /dev/sdb

# Third step: create the thin pool
lvcreate -T -n thin-pool-1 -L 45G vg_localcache
```
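As a sanity check on the sizing: with the 45G pool above and `overprovision-ratio: 2` from the device class, TopoLVM will admit roughly twice the pool's physical size in total virtual volume capacity. A sketch of the arithmetic (not TopoLVM output):

```shell
# Thin volumes are allocated virtually; the overprovision-ratio caps the
# total virtual size TopoLVM will schedule against the pool.
pool_size_gb=45
overprovision_ratio=2
echo "provisionable capacity: $((pool_size_gb * overprovision_ratio))G"
```

Prints `provisionable capacity: 90G`.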
- I adjusted my StorageClass
```yaml
storageClasses:
  - name: topolvm-provisioner # Defines the name of the storage class.
    storageClass:
      fsType: ext4
      reclaimPolicy: # Delete
      annotations: {}
      isDefaultClass: false
      volumeBindingMode: WaitForFirstConsumer
      allowVolumeExpansion: true
      additionalParameters:
        topolvm.io/device-class: "localcache-hdd-thin"
  - name: topolvm-provisioner-thin
    storageClass:
      fsType: xfs
      isDefaultClass: true
      volumeBindingMode: WaitForFirstConsumer
      allowVolumeExpansion: true
      additionalParameters:
        '{{ include "topolvm.pluginName" . }}/device-class': "localcache-hdd-thin"
```
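For completeness, a PVC that would be provisioned from the thin class above might look like this (a minimal sketch; the claim name and size are hypothetical, not from my setup):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cache-pvc          # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi        # carved virtually from thin-pool-1
  storageClassName: topolvm-provisioner-thin
```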
- Redeployed TopoLVM with Helm
```shell
#!/bin/bash
set -euo pipefail

TOPOLVM_CONTEXT=ghrunner-test
TOPOLVM_NAMESPACE=topolvm-system
TOPOLVM_INSTALLATION_NAME=topolvm
TOPOLVM_VALUES_FILE=/topolvm/kube-system/values.yaml
KUBECONFIG_PATH=~/.kube/config.yaml
TOPOLVM_CHART_VERSION=13.0.1

# Change context
kubectl config use-context "${TOPOLVM_CONTEXT}"

# Add the helm repo for topolvm
helm repo add topolvm https://topolvm.github.io/topolvm \
  && helm repo update

# Create the namespace and exclude it from the TopoLVM webhook
kubectl --kubeconfig "${KUBECONFIG_PATH}" create namespace "${TOPOLVM_NAMESPACE}" || true
kubectl --kubeconfig "${KUBECONFIG_PATH}" label namespace "${TOPOLVM_NAMESPACE}" topolvm.io/webhook=ignore

# kube-system already exists; just label it
kubectl --kubeconfig "${KUBECONFIG_PATH}" label namespace kube-system topolvm.io/webhook=ignore

# Install TopoLVM with the release name
helm --kubeconfig "${KUBECONFIG_PATH}" upgrade --install "${TOPOLVM_INSTALLATION_NAME}" \
  --namespace "${TOPOLVM_NAMESPACE}" \
  --create-namespace \
  --values "${TOPOLVM_VALUES_FILE}" \
  --version "${TOPOLVM_CHART_VERSION}" \
  --debug \
  topolvm/topolvm

# Check that TopoLVM is running. All TopoLVM pods should be Running.
kubectl --kubeconfig "${KUBECONFIG_PATH}" get pod --namespace "${TOPOLVM_NAMESPACE}"
```
- This is a simple script that saves me the effort of deploying resources individually.
```shell
#!/bin/bash
set -euo pipefail

# Configuration variables
TOPOLVM_CONTEXT=ghrunner-test
KUBECONFIG_PATH=~/.kube/config.yaml
CONFIG_DIR=/topolvm/custom
TOPOLVM_NAMESPACE=default

# Function to apply Kubernetes configurations
apply_config() {
  echo "🚀 Applying $1..."
  kubectl --kubeconfig "${KUBECONFIG_PATH}" apply -f "${CONFIG_DIR}/$1"
  sleep 10
}

# Function to delete Kubernetes configurations
delete_config() {
  echo "💀 Deleting $1..."
  kubectl --kubeconfig "${KUBECONFIG_PATH}" delete -f "${CONFIG_DIR}/$1"
  sleep 10
}

# Create PVC
apply_config "pvc.yaml"

# Create initial Pod
apply_config "dind-init-pod.yaml"

# Pull Docker images
echo "🐋 Pulling images"
kubectl --kubeconfig "${KUBECONFIG_PATH}" exec -n "${TOPOLVM_NAMESPACE}" dind -- docker pull ubuntu
sleep 10

# Create SnapshotClass
apply_config "snapshotclass.yaml"

# Remove initial Pod
delete_config "dind-init-pod.yaml"

# Create Snapshot
apply_config "snapshot.yaml"

# Remove original PVC
delete_config "pvc.yaml"

# Start second Pod and bind PVC
apply_config "dind-2-pod.yaml"

# Start third Pod
apply_config "dind-3-pod.yaml"
```
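For reference, the `snapshotclass.yaml` and `snapshot.yaml` applied above could look roughly like this (a sketch of minimal manifests; the object names and the source PVC name are assumptions, not my exact files):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: topolvm-snapclass
driver: topolvm.io               # must match the TopoLVM CSI driver name
deletionPolicy: Delete
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: cache-snapshot
spec:
  volumeSnapshotClassName: topolvm-snapclass
  source:
    persistentVolumeClaimName: cache-pvc   # the PVC created by pvc.yaml
```

A restored PVC then references the snapshot via `dataSource` (it must be at least as large as the source):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cache-pvc-restored
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: topolvm-provisioner-thin
  dataSource:
    name: cache-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
```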
With this, I was able to create a snapshot from a PVC and use the snapshot to create new PVCs and bind them to my two pods.
This was exactly what I wanted to do. Thank you very much.
I'm glad to hear that! I've opened another PR (#833) to track an additional document for Snapshot & Restore, so I'll close this issue.