kube-image-keeper-registry-0 goes into a CrashLoopBackOff when using AWS EBS persistent storage
mccullough-ea opened this issue
Sometimes when the kube-image-keeper-registry-0 pod is restarted, it goes into a CrashLoopBackOff with something like this at the end of its logs:
```
garbage-collector docker.elastic.co/beats/filebeat: marking blob sha256:89732bc7504122601f40269fc9ddfb70982e633ea9caf641ae45736f2846b004
garbage-collector docker.io/jgraph/drawio
garbage-collector manifest eligible for deletion: sha256:fb2a84c7a2e04d4ea2e5aa0c57385e0e61dd3c7c5ea559a09d5a3a2cca6de28f
```
I haven't found any errors in the logs, but they always end with "garbage-collector manifest eligible for deletion".
My workaround is to delete the PVC and then restart the pod again, so there must be something on the volume that breaks it.
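In case it's useful, this is roughly what I do (a sketch; the namespace and the PVC name placeholder are assumptions, since StatefulSet PVCs are named `<claim-template>-<pod-name>` and may differ in your cluster):

```sh
# Grab the logs of the crashed container first, in case there is more to see.
kubectl -n kube-image-keeper logs kube-image-keeper-registry-0 --previous

# Find the registry's PVC, then delete it followed by the pod. The PVC stays
# in Terminating until the pod is gone; the StatefulSet then recreates both
# with a fresh volume.
kubectl -n kube-image-keeper get pvc
kubectl -n kube-image-keeper delete pvc <registry-pvc-name>
kubectl -n kube-image-keeper delete pod kube-image-keeper-registry-0
```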
We deploy it using the kube-image-keeper Helm chart v1.4.0 from https://charts.enix.io/.
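For reference, the install looks roughly like this (a sketch; the repo alias, release name, and namespace are our own choices):

```sh
# Add the Enix chart repository and install the chart at the version we use.
helm repo add enix https://charts.enix.io
helm upgrade --install kube-image-keeper enix/kube-image-keeper \
  --version 1.4.0 \
  --namespace kube-image-keeper --create-namespace
```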
It is hosted in EKS with k8s version 1.24 and uses the following storage class:
```yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
  name: encrypted-ebs
parameters:
  csi.storage.k8s.io/fstype: ext4
  encrypted: "true"
  type: gp3
provisioner: ebs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```
Any help would be greatly appreciated.
Hello @mccullough-ea, can you please test again with our latest beta release (1.5.0-beta.1, published yesterday), or update the deployment and set the registry.persistence.deleteUntagged Helm value to false?
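Applied as a Helm upgrade, the second option could look roughly like this (a sketch; the release and namespace names are assumptions taken from the install above):

```sh
# Flip the garbage-collection setting on the existing release,
# keeping all other values as they are.
helm upgrade kube-image-keeper enix/kube-image-keeper \
  --namespace kube-image-keeper \
  --reuse-values \
  --set registry.persistence.deleteUntagged=false
```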
Thanks for the suggestion, 1.5.0-beta.1 seems to have fixed it! I'll keep testing just in case I got lucky...
@Nicolasgouze Thanks again for the suggestion, I can't seem to break it any more no matter how hard I try! Closing the issue.