freegroup / kube-s3

Kubernetes pods used shared S3 storage


pod restart causes error while creating mount source path

guysoft opened this issue

I have cases where the s3 provider pod dies and tries to restart. When that happens, the restart fails and I get the following in the describe pod output:

  Warning  Failed     7s (x4 over 47s)  kubelet, ip-192-168-7-42.eu-west-1.compute.internal  Error: failed to start container "s3fuse": Error response from daemon: error while creating mount source path '/mnt/data-s3fs-shapedo-tools-eu': mkdir /mnt/data-s3fs: file exists
  Warning  BackOff    6s (x3 over 21s)  kubelet, ip-192-168-7-42.eu-west-1.compute.internal  Back-off restarting failed container

On the node I can see that the folder is still mounted, with "Transport endpoint is not connected".

If I unmount the folder the pod runs normally.
Using EKS 1.15

Thanks for your bug report. The main issue is that the "preStop" hook is missing in the Daemonset.yaml.
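As a rough illustration of the kind of hook meant here, a container entry in the DaemonSet might carry a preStop handler like the sketch below. This is not the project's actual manifest: the image name, volume name and the in-container mount path /var/s3 are placeholders assumed for the example. Only the container name "s3fuse" comes from the error message above.

```yaml
# Hypothetical excerpt of the s3 provider DaemonSet (spec.template.spec).
# Image, volume name and /var/s3 are assumptions, not taken from the repo.
containers:
  - name: s3fuse
    image: example/s3fuse:latest        # placeholder image
    securityContext:
      privileged: true                  # FUSE mounts typically need this
    lifecycle:
      preStop:
        exec:
          # Unmount the FUSE filesystem before the container stops, so the
          # host is not left with a stale mount point.
          command: ["/bin/sh", "-c", "umount -f /var/s3"]
    volumeMounts:
      - name: s3-data                   # hypothetical volume name
        mountPath: /var/s3
        mountPropagation: Bidirectional # share the FUSE mount with the host
```

Note that Kubernetes runs preStop only when it terminates the container itself (for example on pod deletion). It does not run if the container process crashes on its own, so a stale mount may still be left behind in that case.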

Ok, I saw you also pushed new stuff just now, will go over it and see if it solves the issue here.

Ok, that fixes it!

However there is another issue I can see now on the example pod.

The issue is that if you set the host share to the same path as the mountpoint on the host, then when the host remounts the fuse volume, the example pod loses the folder link and gets "Transport endpoint is not connected".

The workaround I found was to mount a folder further up the tree into the example pod. That folder remains static and does not get a different mapping when the fuse volume is remounted.
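A minimal sketch of that workaround, assuming the fuse volume is mounted on the host at /mnt/data-s3fs (the path from the first error above), so the pod mounts the parent /mnt instead. The pod name, image, volume name and container path /host-mnt are hypothetical, and the mountPropagation setting is an assumption that may or may not be needed in a given setup.

```yaml
# Hypothetical example pod that mounts the parent of the fuse mountpoint.
apiVersion: v1
kind: Pod
metadata:
  name: s3-example                      # placeholder name
spec:
  containers:
    - name: app
      image: example/app:latest         # placeholder image
      volumeMounts:
        - name: host-mnt
          mountPath: /host-mnt          # data is then read from /host-mnt/data-s3fs
          # Assumption: lets mounts made on the host after the pod has
          # started show up inside the container.
          mountPropagation: HostToContainer
  volumes:
    - name: host-mnt
      hostPath:
        path: /mnt                      # parent folder, not the mountpoint itself
        type: Directory
```

Because the pod is bound to /mnt rather than to the mountpoint, a remount of /mnt/data-s3fs on the host does not replace the directory the pod is attached to.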

Should I open a new issue for that and close this one, or should it be discussed here?

new issue would be fine.... + ....pull request :-)

Hey,
So I am getting this error again.
On the node I see:

  Warning  Failed   6m30s (x4 over 7m48s)   kubelet, ip-192-168-7-154.eu-west-1.compute.internal  Error: failed to start container "s3fuse-shapedo-website-eu": Error response from daemon: error while creating mount source path '/mnt/data-s3-fs': mkdir /mnt/data-s3-fs: file exists

I also see these conditions on the node, not sure if they are related:

  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Sun, 28 Feb 2021 17:40:00 +0200   Mon, 15 Feb 2021 22:06:57 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Sun, 28 Feb 2021 17:40:00 +0200   Mon, 15 Feb 2021 22:06:57 +0200   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Sun, 28 Feb 2021 17:40:00 +0200   Mon, 15 Feb 2021 22:06:57 +0200   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Sun, 28 Feb 2021 17:40:00 +0200   Mon, 15 Feb 2021 22:07:17 +0200   KubeletReady                 kubelet is posting ready status

Should I open a new issue?