rexray / rexray

REX-Ray is a container storage orchestration engine enabling persistence for cloud native workloads

Home Page: http://rexray.io

bug: EBS multi-attach io2 not possible (eu-central-1)

sgohl opened this issue · comments

commented

Summary

  • 2 EC2 instances (m5)
  • 1 io2 volume in eu-central-1b, 200 GB, 3000 IOPS, multi-attach enabled

Steps to reproduce

install rexray/ebs:latest or rexray/ebs:edge and create a Docker volume via

docker run -it --rm --volume-driver=rexray/ebs:edge -v data-p1:/data alpine sh

/ # touch /data/hey
/ # ls -al /data/
total 4
drwx------    2 root     root          4096 Feb  3 21:27 .
drwxr-xr-x    1 root     root            30 Feb  3 21:27 ..
-rw-r--r--    1 root     root             0 Feb  3 21:27 hey

Exit the container, modify the volume in the EC2 dashboard to io2 with multi-attach enabled, and wait until the volume optimization is done.
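For reference, the dashboard steps above can also be done with the AWS CLI. This is only a sketch; the volume ID below is a placeholder.

```shell
# Sketch of the EC2 dashboard steps via the AWS CLI.
# vol-0123456789abcdef0 is a placeholder volume ID.

# Convert the existing volume to io2 with multi-attach enabled
# (multi-attach requires io1/io2 and Nitro-based instances such as m5):
aws ec2 modify-volume \
    --volume-id vol-0123456789abcdef0 \
    --volume-type io2 \
    --iops 3000 \
    --multi-attach-enabled

# Poll until the modification (the "optimization") has finished:
aws ec2 describe-volumes-modifications \
    --volume-ids vol-0123456789abcdef0 \
    --query 'VolumesModifications[0].ModificationState'
```

This requires live AWS credentials, so it is shown here only as an ops fragment.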

On m5 node1, do:

[ec2-user@node1 ~]$ docker run -it --rm --volume-driver=rexray/ebs:edge -v data-p1:/data alpine sh
/ # ls -al /data/
total 4
drwx------    2 root     root          4096 Feb  3 21:27 .
drwxr-xr-x    1 root     root            30 Feb  3 21:33 ..
-rw-r--r--    1 root     root             0 Feb  3 21:27 hey

This works as expected. Now attach the same volume on m5 node2:

[ec2-user@node2 ~]$ docker run -it --rm --volume-driver=rexray/ebs -v data-p1:/data alpine sh
docker: Error response from daemon: error while mounting volume '': VolumeDriver.Mount: docker-legacy: Mount: data-p1: failed: resource not found.
ERRO[0000] error waiting for container: context canceled 

After manually detaching the volume from node1, I can successfully attach it to node2, and vice versa.
Manually attaching the volume to both nodes and mounting it the classic way via nvme1n1 also works.
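The manual workaround described above might look roughly like this with the AWS CLI and a plain mount; the volume and instance IDs and the mount point are placeholders.

```shell
# Manual workaround sketch: move the volume between nodes by hand.
# All IDs are placeholders.

# Detach from node1, then attach to node2:
aws ec2 detach-volume --volume-id vol-0123456789abcdef0
aws ec2 attach-volume \
    --volume-id vol-0123456789abcdef0 \
    --instance-id i-0fedcba9876543210 \
    --device /dev/sdf

# On node2, the device shows up as an NVMe block device;
# mount it directly ("classic mounting via nvme1n1"):
sudo mkdir -p /mnt/data
sudo mount /dev/nvme1n1 /mnt/data
```

Again an ops fragment requiring live AWS access, not a runnable example.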

Version

# docker plugin inspect rexray/ebs:edge

            "Description": "REX-Ray for Amazon EBS",
            "DockerVersion": "18.06.1-ce",
            "Documentation": "https://github.com/thecodeteam/rexray/.docker/plugins/ebs",

AWS Linux 2
Linux node1 4.14.214-160.339.amzn2.x86_64 #1 SMP Sun Jan 10 05:53:05 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

docker version
Client:
 Version:           19.03.13-ce

@sgohl did you ever find a workaround for this issue?

commented

@markgllin unfortunately not, and I'm still interested, too. As of your writing, I guess the bug still exists. I've been using NFS (although much slower) ever since, and didn't come back to this until now.

I don't think this will ever work, but it's a pity to just close the issue.

I feel like there's a bunch of reasons why this is probably not a desirable approach in the first place.

  • Test case is invalid; docker run -it --rm --volume-driver=rexray/ebs:edge -v data-p1:/data alpine sh will create an ext4 filesystem, which is not compatible with multi-attach. You would need to use GFS2 or OCFS2.
  • For the GFS2 case, see https://aws.amazon.com/blogs/storage/clustered-storage-simplified-gfs2-on-amazon-ebs-multi-attach-enabled-volumes/ which explains the setup at the host level. Of particular note is the reliance on an I/O fencing configuration that will terminate an entire EC2 node if the cluster loses quorum. That might pose some reliability concerns for any other random Docker containers that happen to live on the same host.
  • Clustered filesystems such as GFS2/OCFS2 are complex, highly specialized, and tend to be used for single high-volume workloads that will not fit on a single instance of any size and are known to work with that filesystem, e.g. Oracle databases. Setup looks like it would be a bit manual and static, at least until there's a way to automate everything, and that doesn't seem to map well to the K8S/ECS/Docker Swarm philosophy of the anonymous hive of worker nodes that you can binpack tasks onto.
  • A lot of cluster software needs to be installed at the host level and the plugin would have to coordinate with it in a vendor-specific manner. Might as well just use bind mounts at that point, so I'm not sure what's to be gained from having a Docker plugin handle the final mount step.
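If the clustered filesystem is managed at the host level anyway, the bind-mount route from the last point could be as simple as the sketch below; it assumes GFS2 is already set up and mounted at /mnt/gfs2 on each node (a hypothetical path).

```shell
# Sketch: skip the volume plugin entirely and bind-mount the
# host-managed clustered filesystem into the container.
# Assumes /mnt/gfs2 is already a mounted GFS2 filesystem on this node.
docker run -it --rm \
    -v /mnt/gfs2/data-p1:/data \
    alpine sh
```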
commented

will create an ext4 filesystem, which is not compatible with multi-attach. You would need to use GFS2 or OCFS2

Oh sure, I absolutely didn't think about the filesystem in that context. Thanks for pointing that out.

I'm not sure what's to be gained from having a Docker plugin handle the final mount step.

right, I agree with that conclusion

From my perspective, Flocker was the closest we ever got to moving volumes between hosts, at least in open source.

GlusterFS is horrible; in my case it always randomly lost the local endpoint mount.
DRBD is fine, though it requires a bit of concentration when rescuing a failed sync.

Portworx from Pure Storage is a very fine system, if money isn't an issue. Its Docker Swarm compatibility is rather outdated, though, and it's very complex to integrate.

Someone should just invent a block-device-like storage kernel module that simply works over TCP instead of SATA and is as easy to implement as S3, but uses the local filesystem and syncs/replicates, rather than working on the remote storage directly: like ZFS replication, but easier and automatic.

OK off topic, but regarding Portworx: I had a second look at it recently when the rexray/ebs plugin started to need a patch or two.

Not my style; I don't understand the need for the extra abstraction layer when EBS already seems to handle what's required. The conservative engineer in me says "this is not simple enough, it has too many moving parts."

So for those of us who prefer ECS to Kubernetes and just want a minimal, plugin-based EBS mount, rexray/ebs is still the only game in town, except for some random GitHub project that might not be production-ready.

Which is a problem, because the original developers of rexray have basically come out and said that they view rexray as largely supplanted by CSI, and ECS does not support CSI.

commented

this is not simple enough

❤️ love that. Too few people think that way!