kubernetes-csi / csi-driver-smb

This driver allows Kubernetes to access SMB Server on both Linux and Windows nodes.

Slow performance after upgrading node OS from Ubuntu 18.04 to Ubuntu 22.04

Ismael-Pep opened this issue

What happened:
After upgrading our Kubernetes cluster (in the cloud) from 1.24 to 1.25, we started getting errors when a file saved from inside the cluster is read from outside the cluster (on-prem).

Our current flow is:

  1. A file is saved from inside the container to a mount point backed by a PVC (bound to a PV that points to an on-prem Samba share).
  2. A procedure is executed in a database that reads the file from that Samba share and stores its content in the database.

After the upgrade to 1.25, the process started to fail. The database returned an error saying that another process was using the file when it tried to read it. If we add a delay between saving the file and calling the database procedure, the process starts working again. It is unstable with a 0.5 s delay, but with a 1 s delay it works fine.

We have 2 environments, and both started to fail after the upgrade.

From the pod's point of view, the file is complete right after it is saved and can be read without any problems,
but from the database's point of view the file is still being written. It seems that after the upgrade the replication between the PV and the Samba share has become slow.

Initially we were using version 1.0 of the driver, but after seeing that nothing worked we upgraded to version 1.10, with the same result.

What you expected to happen:
The process should keep working after the Kubernetes upgrade.

How to reproduce it:
Create a file inside a mounted PVC and try to access it from outside the cluster right afterwards.

Anything else we need to know?:
The Samba share and the database are on-prem, and the cluster is on Azure AKS.
We've tested the application that saves the file on localhost with no problems.

Environment:
Note: running "kubectl get po -n kube-system -o yaml | grep gcr | grep smb" returns nothing. According to the Helm chart and the plugin image, the driver version is 1.10.

  • CSI Driver version: 1.10
  • Kubernetes version: v1.25.6
  • OS : linux (amd64)
  • Kernel: 5.15.0-1035-azure
  • Install tools: Helm
  • OS Image Ubuntu 22.04.2 LTS
  • Container Runtime: containerd://1.6.18+azure-1
  • Provider: Azure AKS

Thx!

There is a known SMB driver issue on Ubuntu 22.04 with kernel 5.15 that can cause a 30s delay when syncing data right after it is written. You can mitigate it by setting acregmax=0,acdirmax=30 and removing actimeo=30 from the mount options in the PV or storage class, then restarting the pods so the change takes effect.
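
For readers who want to see where these options go, here is a minimal sketch of a StorageClass carrying the mitigation described above. It is an illustration under assumptions, not the reporter's actual manifest: the class name, the share path, and the secret name and namespace are hypothetical placeholders, and the dir_mode/file_mode values are only typical examples. Only acregmax=0, acdirmax=30, and the absence of actimeo=30 come from this thread.

```yaml
# Hypothetical StorageClass for the SMB CSI driver; adjust every placeholder.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: smb-onprem                                           # placeholder name
provisioner: smb.csi.k8s.io
parameters:
  source: //smb-server.example.internal/share                # placeholder share path
  csi.storage.k8s.io/node-stage-secret-name: smbcreds        # placeholder secret
  csi.storage.k8s.io/node-stage-secret-namespace: default    # placeholder namespace
reclaimPolicy: Retain
volumeBindingMode: Immediate
mountOptions:
  - dir_mode=0777    # example value, not from the thread
  - file_mode=0777   # example value, not from the thread
  - acregmax=0       # from the thread: do not cache regular-file attributes
  - acdirmax=30      # from the thread: cache directory attributes for up to 30s
  # actimeo=30 is intentionally absent, as advised in the thread
```

For a statically provisioned PV, the same entries would go under the PV's spec.mountOptions instead.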

Hi, thank you so much, this actually resolved our issue after a week of debugging.
For anyone who arrives here: we had to delete and recreate the storage class and the PVC to apply the changes.