minio / minio

The Object Store for AI Data Infrastructure

Home Page: https://min.io/download

Server init fail on minikube when multiple nodes are set

jordantgh opened this issue · comments

I am trying to follow the k8s helm-based deployment guide, using minikube with multiple nodes, and I'm running into (what appear to be) drive permissions issues which prevent the servers from initialising. I have no such problems with a single node minikube. I have some very simple reprex code below, if you wouldn't mind taking a look.

Expected Behavior

Initialise the minio tenant servers and successfully make a test bucket, as I can with a single-node minikube (i.e., started with minikube start) and a single-server tenant (tenant.pools.servers=1).
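
For reference, the override for the working single-server comparison is roughly the same file shown further down, just with one server (everything else identical as far as I can tell):

tenant:
  pools:
    - servers: 1
      name: pool-0
      volumesPerServer: 1
      size: 2Gi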

Current Behavior

Servers do not initialise successfully and I cannot make a test bucket when using minikube with multiple nodes (i.e., minikube started with the --nodes 4 argument and 4 servers set with tenant.pools.servers=4).

Possible Solution

A janky workaround is to specify local persistent volumes for each node and create their backing directories manually with minikube ssh; a rough sketch of what I mean is below.
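
The path /mnt/minio-data, the PV/StorageClass names, and the uid here are just what I picked for testing, not anything the chart requires; the tenant pool then has to point at the class via storageClassName in the values override.

# create and chown the backing directory on every minikube node
for node in minikube minikube-m02 minikube-m03 minikube-m04; do
  minikube ssh -n "$node" -- sudo mkdir -p /mnt/minio-data
  minikube ssh -n "$node" -- sudo chown 1000:1000 /mnt/minio-data   # tenant pods run as a non-root uid by default
done

# local-pvs.yaml (apply with kubectl apply -f), one PersistentVolume per node
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: minio-pv-minikube            # repeat as minio-pv-minikube-m02, ...
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/minio-data
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - minikube           # pin each PV to its own node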

Steps to Reproduce

minikube start --nodes 4

helm repo add minio-operator https://operator.min.io
helm repo update

helm install \
  --namespace minio-operator \
  --create-namespace \
  operator minio-operator/operator

kubectl get all -n minio-operator
mkdir -p minio-tenant-charts && cd minio-tenant-charts
curl -O https://raw.githubusercontent.com/minio/operator/master/helm-releases/tenant-5.0.14.tgz

helm install \
  --namespace miniotenant-testing \
  --create-namespace \
  -f ../values-override.yaml \
  myminio tenant-5.0.14.tgz

kubectl port-forward svc/myminio-hl 9000 -n miniotenant-testing
mc alias set myminio https://localhost:9000 minio minio123 --insecure
mc mb myminio/mybucket --insecure

values-override.yaml:

tenant:
  pools:
    - servers: 4
      name: pool-0
      volumesPerServer: 1
      size: 2Gi

Context

Trying to set up and learn distributed storage on k8s. Minikube is an easy and free way to start; otherwise I am just following the MinIO docs.

Your Environment

  • Version used:
    quay.io/minio/minio:RELEASE.2024-03-15T01-07-19Z

  • Operating System and version (uname -a):
    Linux jt-home 5.15.146.1-microsoft-standard-WSL2 #1 SMP Thu Jan 11 04:09:03 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

I forgot, I also wanted to post these logs:

>> kubectl logs myminio-pool-0-0 -n miniotenant-testing | head -n 15
Defaulted container "minio" out of: minio, sidecar, validate-arguments (init)

API: SYSTEM()
Time: 19:16:52 UTC 05/07/2024
Error: Drive https://myminio-pool-0-0.myminio-hl.miniotenant-testing.svc.cluster.local:9000/export/data returned an unexpected error: file access denied, please investigate - drive will be offline (*fmt.wrapError)
      10: internal/logger/logonce.go:118:logger.(*logOnceType).logOnceIf()
       9: internal/logger/logonce.go:149:logger.LogOnceIf()
       8: cmd/storage-rest-server.go:1195:cmd.logFatalErrs()
       7: cmd/storage-rest-server.go:1336:cmd.registerStorageRESTHandlers.func2()
       6: cmd/storage-rest-server.go:1352:cmd.registerStorageRESTHandlers()
       5: cmd/routers.go:30:cmd.registerDistErasureRouters()
       4: cmd/routers.go:83:cmd.configureServerHandler()
       3: cmd/server-main.go:759:cmd.serverMain.func8()
       2: cmd/server-main.go:495:cmd.bootstrapTrace()
       1: cmd/server-main.go:758:cmd.serverMain()
>> kubectl logs myminio-pool-0-0 -n miniotenant-testing | tail -n 30
Defaulted container "minio" out of: minio, sidecar, validate-arguments (init)
       2: cmd/server-main.go:495:cmd.bootstrapTrace()
       1: cmd/server-main.go:806:cmd.serverMain()
Waiting for a minimum of 2 drives to come online (elapsed 39m3s)


API: SYSTEM()
Time: 19:56:26 UTC 05/07/2024
Error: Drive https://myminio-pool-0-3.myminio-hl.miniotenant-testing.svc.cluster.local:9000/export/data returned an unexpected error: file access denied, please investigate - drive will be offline (*fmt.wrapError)
       5: internal/logger/logonce.go:118:logger.(*logOnceType).logOnceIf()
       4: internal/logger/logonce.go:149:logger.LogOnceIf()
       3: cmd/storage-rest-server.go:1195:cmd.logFatalErrs()
       2: cmd/storage-rest-server.go:1336:cmd.registerStorageRESTHandlers.func2()
       1: cmd/storage-rest-server.go:1360:cmd.registerStorageRESTHandlers.func3()
Unable to use the drive https://myminio-pool-0-0.myminio-hl.miniotenant-testing.svc.cluster.local:9000/export/data: file access denied

API: SYSTEM()
Time: 19:56:26 UTC 05/07/2024
Error: Read failed. Insufficient number of drives online (*errors.errorString)
      10: internal/logger/logger.go:260:logger.LogIf()
       9: cmd/prepare-storage.go:243:cmd.connectLoadInitFormats()
       8: cmd/prepare-storage.go:304:cmd.waitForFormatErasure()
       7: cmd/erasure-server-pool.go:129:cmd.newErasureServerPools.func1()
       6: cmd/server-main.go:495:cmd.bootstrapTrace()
       5: cmd/erasure-server-pool.go:128:cmd.newErasureServerPools()
       4: cmd/server-main.go:1059:cmd.newObjectLayer()
       3: cmd/server-main.go:808:cmd.serverMain.func10()
       2: cmd/server-main.go:495:cmd.bootstrapTrace()
       1: cmd/server-main.go:806:cmd.serverMain()
Waiting for a minimum of 2 drives to come online (elapsed 39m4s)

It looks like there is a permission mismatch between the user running the process and the backend drives.

Hey @harshavardhana. I agree this seems likely! However, since this issue does not occur with a single-node cluster, it presumably has something to do with inter-node communication, for example the minio pod on node A trying to access a drive on node B. If that is the problem, it's great to narrow it down, but would you be able to offer any suggestions for resolving the issue? If you feel this is more of a minikube issue, that's cool, though I'd still love any thoughts you may have.
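
In case it helps, this is how I was checking where everything landed (plain kubectl, nothing minio-specific):

kubectl get pods -n miniotenant-testing -o wide   # which minikube node each tenant pod is scheduled on
kubectl get pvc -n miniotenant-testing            # claims created for pool-0 and their status
kubectl get pv                                    # backing volumes and their storage class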

file access denied is generated by the drives, not by the network @jordantgh

Do you have any guess as to why this would occur specifically and only when there are multiple nodes/servers?
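
For what it's worth, here is how I tried to inspect the mounted path from inside one of the pods; I'm assuming the image ships id and ls, otherwise the same check can be done on the node via minikube ssh:

kubectl exec -n miniotenant-testing myminio-pool-0-0 -c minio -- id
kubectl exec -n miniotenant-testing myminio-pool-0-0 -c minio -- ls -ld /export /export/data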

@jordantgh it looks like a local problem for you

I am able to successfully use minikube with 4 nodes and operator v5.0.15:

mc admin info myminio/ --insecure
Handling connection for 9000
●  myminio-pool-0-0.myminio-hl.miniotenant-testing.svc.cluster.local:9000
   Uptime: 1 minute 
   Version: 2024-05-01T01:11:10Z
   Network: 4/4 OK 
   Drives: 1/1 OK 
   Pool: 1

●  myminio-pool-0-1.myminio-hl.miniotenant-testing.svc.cluster.local:9000
   Uptime: 1 minute 
   Version: 2024-05-01T01:11:10Z
   Network: 4/4 OK 
   Drives: 1/1 OK 
   Pool: 1

●  myminio-pool-0-2.myminio-hl.miniotenant-testing.svc.cluster.local:9000
   Uptime: 1 minute 
   Version: 2024-05-01T01:11:10Z
   Network: 4/4 OK 
   Drives: 1/1 OK 
   Pool: 1

●  myminio-pool-0-3.myminio-hl.miniotenant-testing.svc.cluster.local:9000
   Uptime: 2 minutes 
   Version: 2024-05-01T01:11:10Z
   Network: 4/4 OK 
   Drives: 1/1 OK 
   Pool: 1

Pools:
   1st, Erasure sets: 1, Drives per erasure set: 4

0 B Used, 1 Bucket, 0 Objects
4 drives online, 0 drives offline, EC:2

The minikube-side storage setup used here (swapping the default provisioner for the hostpath CSI driver):

minikube addons disable storage-provisioner
minikube addons disable default-storageclass
minikube addons enable volumesnapshots
minikube addons enable csi-hostpath-driver
kubectl patch storageclass csi-hostpath-sc -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
kubectl delete storageclass standard
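
To verify the switch took effect before reinstalling the tenant (standard kubectl, nothing operator-specific):

kubectl get storageclass                 # csi-hostpath-sc should now be marked (default)
kubectl get pvc -n miniotenant-testing   # tenant claims should end up Bound against csi-hostpath-sc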

Thanks, this solved the issue for me.