openebs / mayastor

Dynamically provision Stateful Persistent Replicated Cluster-wide Fabric Volumes & Filesystems for Kubernetes that is provisioned from an optimized NVME SPDK backend data storage stack.

Cannot create `VolumeSnapshot` with mayastor v2.4.0

synthe102 opened this issue · comments

Describe the bug
With Mayastor v2.4.0, I can't successfully create a VolumeSnapshot: it never enters a ready state.
The same error message appears on both the VolumeSnapshot and the VolumeSnapshotContent:
Failed to check and update snapshot content: failed to take snapshot of the volume 7932c417-8f77-47cd-bdb5-1f0dd13f92ea: "rpc error: code = Internal desc = Operation failed: GenericOperation(504, \"error in response: status code ''504 Gateway Timeout'', content: ''RestJsonError { details: \\\"\\\", message:\\\"SvcError :: SnapshotMaxTransactions: Reached maximum transactions for snapshot:26a478aa-d3c7-47ce-98b1-5f60a84cf84f, needs to be reconciled\\\",kind:DeadlineExceeded}''\")"

To Reproduce
Steps to reproduce the behavior:

  • Follow the getting started guide
  • Create a single replica StorageClass
  • Create a VolumeSnapshotClass
  • Create a PVC
  • Create a VolumeSnapshot

Expected behavior
I would expect the VolumeSnapshot to enter a ready state after some time.

OS info

  • Distro: Talos Linux
  • Kernel version: 6.1.61
  • MayaStor revision or container image: v2.4.0

Additional context

Seems like something keeps failing :/

Would you be able to attach a support bundle here?

kubectl mayastor dump system -n mayastor -d <output_directory_path>

https://mayastor.gitbook.io/introduction/advanced-operations/supportability#using-the-supportability-tool
You can download the binaries from here: https://github.com/openebs/mayastor-extensions/releases/tag/v2.4.0

Thanks for the quick answer !
Here is the support bundle.

mayastor-2023-11-22--17-44-00-UTC.tar.gz

hmm I'm a bit puzzled:

2023-11-22T08:55:08.445534Z ERROR core::volume::service: error: gRPC request 'create_snapshot' for 'Nexus' failed with 'status: Unimplemented, message: "", details: [], metadata: MetadataMap { headers: {"date": "Wed, 22 Nov 2023 08:55:08 GMT", "content-type": "application/grpc", "content-length": "0"} }'

Any chance you're running an older version of the dataplane (mayastor-io-engine)? Sadly, it seems there is nothing in the support bundle (SB) to tell us that. We should improve the SB here, I'll take note.
Meanwhile, would you be able to restart the mayastor-io-engine pod? Then, if it's still not working, take another SB? Thank you

I'd like to apologize, I don't know why but my IO Engine was running v2.1.0. I restarted the pod, it picked up the v2.4.0 image and everything is working just fine.
Thanks for your help, and sorry for bothering you.

No bother at all, and we should probably handle this better anyway to make it easier to spot!
Thanks!
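
For anyone hitting the same symptom: the root cause here was an io-engine pod still running v2.1.0 next to a v2.4.0 control plane. Below is a minimal sketch of a check for that kind of version skew. It assumes you have already collected the container image references of the running pods, for example with `kubectl get pods -n mayastor -o jsonpath='{.items[*].spec.containers[*].image}'`. The helper names `image_tag` and `mismatched_images` and the sample image list are hypothetical, made up for illustration; they are not part of Mayastor or its kubectl plugin.

```python
from typing import Iterable, List


def image_tag(image: str) -> str:
    """Return the tag of a container image reference, or "" if it has none."""
    # Drop any digest suffix (e.g. "@sha256:...") before looking for the tag.
    name = image.split("@", 1)[0]
    # Only look for ":" in the last path segment, so a registry port such as
    # "registry:5000/foo" is not mistaken for a tag.
    last_segment = name.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return ""
    return last_segment.rsplit(":", 1)[1]


def mismatched_images(images: Iterable[str], expected_tag: str) -> List[str]:
    """Return the images whose tag differs from expected_tag."""
    return [img for img in images if image_tag(img) != expected_tag]


if __name__ == "__main__":
    # Hypothetical sample resembling the situation in this thread.
    running = [
        "docker.io/openebs/mayastor-agent-core:v2.4.0",
        "docker.io/openebs/mayastor-csi-node:v2.4.0",
        "docker.io/openebs/mayastor-io-engine:v2.1.0",
    ]
    for image in mismatched_images(running, "v2.4.0"):
        print("unexpected version:", image)
```

Third-party images deployed alongside Mayastor (etcd, for instance) carry their own version tags, so in practice you would probably filter the list down to the Mayastor images before comparing.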