kubernetes-csi / csi-driver-smb

This driver allows Kubernetes to access SMB Server on both Linux and Windows nodes.

Sporadic mount failure on Windows: "The Workstation service has not been started", although the service had been running all along

doctorpangloss opened this issue

What happened:
Pod stuck in creation phase with:

NewSmbGlobalMapping failed. output: "New-SmbGlobalMapping : The Workstation service has not been started. \r\nAt line:1 char:190\r\n+ ... ser, $PWord;New-SmbGlobalMapping -RemotePath $Env:smbremotepath -Cred ...\r\n+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r\n    + CategoryInfo          : NotSpecified: (MSFT_SmbGlobalMapping:ROOT/Microsoft/...mbGlobalMapping) [New-SmbGlobalMa \r\n   pping], CimException\r\n    + FullyQualifiedErrorId : Windows System Error 2138,New-SmbGlobalMapping\r\n \r\n", err: exit status 1

SSHing into the node showed that the service was in fact already running.

Nonetheless, running nssm restart LanmanWorkstation resolved the issue.
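For anyone who wants to script the "is the service really running?" check, here is a minimal sketch. It assumes the state is read from the text printed by sc.exe query LanmanWorkstation, which is not necessarily how the check above was done. The helper name parseServiceState is hypothetical and not part of the driver.

```go
package main

import "strings"

// parseServiceState extracts the state name (for example "RUNNING" or
// "STOPPED") from the text printed by `sc.exe query <service>`.
// Hypothetical helper for illustration; it is not part of csi-driver-smb.
// The second return value is false if no STATE line is found.
func parseServiceState(scOutput string) (string, bool) {
	for _, line := range strings.Split(scOutput, "\n") {
		// strings.Fields also discards the trailing "\r" of CRLF output.
		fields := strings.Fields(line)
		// Expected shape of the line: STATE : 4 RUNNING
		if len(fields) >= 4 && fields[0] == "STATE" && fields[1] == ":" {
			return fields[3], true
		}
	}
	return "", false
}
```

In this issue such a check would have reported the service as running even while the mount failed, so a status check alone would probably not be enough to detect the problem.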

What you expected to happen:

If this error appears, the driver should restart the service.

How to reproduce it:
Possibly by using the February 2024 Windows Server 2022 release with the latest csi-driver-smb 1.14, but it's hard to say why this occurs.

FWIW, the daemonset was failing to start on about 1 in 3 Windows nodes with 1.13 and the latest Windows patches.

Anything else we need to know?:

Environment:

  • CSI Driver version:
$ kubectl get po -n kube-system -o yaml | grep registry.k8s | grep smb
      image: registry.k8s.io/sig-storage/smbplugin:v1.14.0
      image: registry.k8s.io/sig-storage/smbplugin:v1.14.0
      imageID: registry.k8s.io/sig-storage/smbplugin@sha256:4e97e6f8c122c87253c89fce466e760f88122aa4a7b21677fad4c603144cc0dd
  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.1", GitCommit:"8f94681cd294aa8cfd3407b8191f6c70214973a4", GitTreeState:"clean", BuildDate:"2023-01-18T15:58:16Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"windows/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.2+k0s", GitCommit:"fc04e732bb3e7198d2fa44efa5457c7c6f8c0f5b", GitTreeState:"clean", BuildDate:"2023-03-02T19:21:48Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"}
  • OS (e.g. from /etc/os-release): Windows 2022
  • Kernel (e.g. uname -a): Windows_NT AppMana-Hostname-XXX 10.0 20348 x86_64 MS/Windows

I reverted from Feb 2024 Windows 2022 because it also breaks Calico. Perhaps these are related.

The issue cropped up again:

  Warning  FailedMount  2m (x10 over 8m21s)  kubelet            MountVolume.MountDevice failed for volume "pvc-365af285-dd45-4376-86a8-64fa28c78f49" : rpc error: code = Internal desc = volume(appmana-017-ds.i.appmana.com/appmana-cluster-03#pvc-365af285-dd45-4376-86a8-64fa28c78f49#) mount "//appmana-017-ds.i.appmana.com/appmana-cluster-03/pvc-365af285-dd45-4376-86a8-64fa28c78f49" on "\\var\\lib\\kubelet\\plugins\\kubernetes.io\\csi\\smb.csi.k8s.io\\1d50cf8e540807f9541f7d297b25d00fdb89a19cc7d7ede4e23fc87f89034f5c\\globalmount" failed with NewSmbGlobalMapping(\\appmana-017-ds.i.appmana.com\appmana-cluster-03\pvc-365af285-dd45-4376-86a8-64fa28c78f49, c:\var\lib\kubelet\plugins\kubernetes.io\csi\smb.csi.k8s.io\1d50cf8e540807f9541f7d297b25d00fdb89a19cc7d7ede4e23fc87f89034f5c\globalmount) failed with error: rpc error: code = Unknown desc = NewSmbGlobalMapping failed. output: "New-SmbGlobalMapping : The Workstation service has not been started. \r\nAt line:1 char:190\r\n+ ... ser, $PWord;New-SmbGlobalMapping -RemotePath $Env:smbremotepath -Cred ...\r\n+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\r\n    + CategoryInfo          : NotSpecified: (MSFT_SmbGlobalMapping:ROOT/Microsoft/...mbGlobalMapping) [New-SmbGlobalMa \r\n   pping], CimException\r\n    + FullyQualifiedErrorId : Windows System Error 2138,New-SmbGlobalMapping\r\n \r\n", err: exit status 1
  Warning  FailedMount  110s                 kubelet            Unable to attach or mount volumes: unmounted volumes=[comfyui-volume], unattached volumes=[kube-api-access-md2wm comfyui-volume workdir-volume]: timed out waiting for the condition