Overlayfs does not work with unix domain sockets
analytically opened this issue
This issue (Unix sockets not working) affects the container's root filesystem, including the /var/ and /tmp folders, but not volumes of either type, so long as the volume's underlying filesystem is 'normal' for Linux and permits the creation of working sockets.
@analytically Can you please summarize what this bug is?
Unix domain sockets don't seem to work when using overlayfs. When using device mapper, it works fine.
Ah, right. OK then. That is the newest Linux kernel filesystem, intended to replace device mapper and aufs in the future. Sorry, I was confused by the title, because some of us use the word 'overlay' to refer to something else entirely.
"overlayfs does not work with unix domain sockets"
Just doing a simple go program with a unix socket listener works for me.
I can reproduce with image from @analytically
hahah, of course debugging is great.
bash-4.3# strace -o /log -fff supervisord -k -c /etc/supervisord.conf
2015-05-13 15:52:08,211 CRIT Supervisor running as root (no user in config file)
2015-05-13 15:52:08,338 INFO RPC interface 'supervisor' initialized
2015-05-13 15:52:08,339 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2015-05-13 15:52:08,341 INFO supervisord started with pid 22
2015-05-13 15:52:09,350 INFO spawned: 'test-unix-domain-socket' with pid 25
2015-05-13 15:52:10,354 INFO success: test-unix-domain-socket entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2015-05-13 15:52:10,436 INFO exited: test-unix-domain-socket (exit status 0; expected)
^C2015-05-13 15:53:24,850 WARN received SIGINT indicating exit request
bash-4.3# supervisord -k -c /etc/supervisord.conf
2015-05-13 15:53:42,591 CRIT Supervisor running as root (no user in config file)
2015-05-13 15:53:42,664 INFO RPC interface 'supervisor' initialized
2015-05-13 15:53:42,665 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2015-05-13 15:53:42,665 INFO supervisord started with pid 51
2015-05-13 15:53:43,671 INFO spawned: 'test-unix-domain-socket' with pid 54
2015-05-13 15:53:43,996 INFO exited: test-unix-domain-socket (exit status 0; not expected)
2015-05-13 15:53:45,002 INFO spawned: 'test-unix-domain-socket' with pid 58
2015-05-13 15:53:45,326 INFO exited: test-unix-domain-socket (exit status 0; not expected)
2015-05-13 15:53:47,333 INFO spawned: 'test-unix-domain-socket' with pid 62
2015-05-13 15:53:47,663 INFO exited: test-unix-domain-socket (exit status 0; not expected)
2015-05-13 15:53:50,672 INFO spawned: 'test-unix-domain-socket' with pid 66
2015-05-13 15:53:51,007 INFO exited: test-unix-domain-socket (exit status 0; not expected)
2015-05-13 15:53:52,009 INFO gave up: test-unix-domain-socket entered FATAL state, too many start retries too quickly
^C2015-05-13 15:53:56,395 WARN received SIGINT indicating exit request
bash-4.3#
Why was this removed from 1.7.0? We are currently stuck in a situation where no storage backend is able to fulfill our needs, and this is one of the blockers for us for Overlay...
@stevenbrichards we were really hoping to get overlay as the default graph driver in 1.7, but there are several issues around overlay remaining; some of them are caused by bugs in overlay itself so cannot be solved by docker, but need to be fixed in the kernel first.
Rest assured that improving overlay is top priority in docker, but getting it resolved before the 1.7 release just wasn't possible.
On a brighter note; starting with the 1.7 release, there will be an "experimental" docker release, with a nightly or weekly release (undecided yet). That release will be used to test upcoming features before they end up in the official release. So, once this is fixed, you'll be able to test it in that release.
Oops, wrong autocomplete; meant to use @stevenschlansker (apologies)
I believe the kernel issue here will be fixed in Linux 4.1.
@philips thanks for the heads up; looking forward to that!
I added this to the 1.8 milestone; I must admit I'm not sure this is something to be fixed in Docker or overlay, but I'm keeping it open for now so that it's easier to find for people running into this.
@thaJeztah we are looking forward to it too!
I don't think there is anything that the docker engine can do differently but I agree we should absolutely keep this tracking bug issue.
@philips do you have a link to the kernel bug tracking where it talks about this issue? I just upgraded to 4.1.2 and it looks like this is still a problem.
@calavera There is no tracking issue AFAIK. I pinged Miklos.
@philips - is this bug relevant: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1214500
I'm hitting this problem as well. Is there a short-term solution, like switching to devicemapper or something? We use Unix sockets in our nginx and supervisord configurations.
You can potentially place your UNIX sockets into a volume (or a bind-mount from the host), depending on the backing storage for your volumes / host storage.
Right then. So just to be clear:
This issue (Unix sockets not working) affects the container's root filesystem, including the /var/ and /tmp folders, but not volumes of either type, so long as the volume's underlying filesystem is 'normal' for Linux and permits the creation of working sockets.
Is that a correct description of this issue? Please confirm Y/N.
@analytically please update your issue description at the top of the page accordingly, so that other users (such as myself) can clearly understand the limits of this problem in a few sentences.
@stevenschlansker - confirmed working. Had to change a whole bunch of socket paths to "/var/run", but everything seems to be working fine.
My main concern is whether I'm just lucking out and one day everything will fall apart.
No luck involved -- as long as your socket isn't trying to reside on overlay, you'll be fine.
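To check which filesystem a given socket path would land on, one option is to match the path against the mount table. The following is only a sketch: the helper names fstype_for_path and fstype_on_this_host are hypothetical, and it assumes a Linux host where /proc/self/mounts is readable and mount points contain no escaped spaces.

```python
import os


def fstype_for_path(path, mounts_text):
    """Return the filesystem type of the deepest mount point containing `path`.

    `mounts_text` is expected to be in /proc/self/mounts format:
    device, mount point, filesystem type, options, dump, pass.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # ">=" lets a later mount on the same mount point shadow an earlier one.
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fstype
    return best_type


def fstype_on_this_host(path):
    """Look up `path` in this process's own mount table (Linux only)."""
    with open("/proc/self/mounts") as mounts:
        return fstype_for_path(path, mounts.read())
```

Inside a container, fstype_on_this_host("/var/run/app.sock") returning "overlay" would suggest the socket still sits on the overlay root filesystem, while a volume or a tmpfs such as /dev/shm would report its own type.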
@sandys it is a kernel issue; I think the launchpad bug is filed correctly.
Still seeing this on Kernel 4.1.6 and Docker 1.8.2, build 0a8c2e3.
@analytically looking at the linked launchpad issue (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1214500), this hasn't been fixed yet in the kernel, so yes, there is nothing a docker update can do to resolve it.
Hey there, the bug in launchpad hasn't seen an update for close to a year.
I am stuck with device-mapper and 12sec removal times, so I really would like to see progress. :)
Was anybody able to create a simpler test case? I tried just listening on a socket on overlayfs and it seems to work for me :/
@akram Yup, reproduced, thanks! Now it's time to trace kernel.
Should this be pushed in the kernel-bugtracker (https://bugzilla.kernel.org/) as well?
I found the problem in the kernel, but I don't know how to fix it :)
@LK4D4 links?
@vbatts
Here the dentry for the hard link will have a new inode, but the code expects to find the unix socket under the same inode. So even if both dentries (dentries on overlay) have an upperdentry (dentry on ext4) with the same inode, unix_find_socket_byinode uses their own inodes directly, and those differ. I tried to just d_instantiate the inode for the hard link with the inode from the source, but got "kernel BUG at fs/inode.c:1493" on removal of both the original file and the hard link (which is expected, because I hardly understand what's going on). However, the unix socket works in that case.
Here are also some results that might help in finding the root cause:
sock in upper, link in target - can connect in upper, can't in target
sock in upper, link in upper - can connect in upper, can't in target
sock in target, link in target - can't connect
sock in target, link in upper - can't connect
It is also a little confusing that vfs_fstat returns the ino for the upperdentry. I could try to use it somehow to get the real inode after kern_path, but I'm not sure that it won't break things for other filesystems.
I think I got it, at least without panic. Testing with docker now.
EDIT: yup, it works. Here is the patch, if someone is brave enough: https://gist.github.com/anonymous/ad0af1c00aedd27ee0b4
@LK4D4 : I tested your patch both with my minimalist reproducer and with my “real-life” use-case with docker and it works for me.
However, the patch contains a bug, which I fixed in my branch. I'll try to send it to lkml today.
When do we expect this to be merged back into the main docker repo?
@codingwithoutcomments this is for the kernel, not Docker.
FWIW, I "solved" the supervisord issue above by changing the sock to /dev/shm/supervisor.sock.
@kung-foo thanks, possibly that's a workaround for some others as well
@LK4D4 https://git.kernel.org/cgit/linux/kernel/git/mszeredi/vfs.git/commit/?h=overlayfs-linus&id=30402c8949934fbaca07d9c20074d0d7a5a8385f this commit is in the merge queue for kernel 4.7-rc - does this fix this issue?
https://lkml.org/lkml/2016/6/16/87 fixes it for me.
@brauner, right that's the PR which includes the commit above, thanks for checking, we may want to close this once 4.7 lands
@runcom, yeah I know, just realized too late that you already posted. :)
Seems like this can be closed with 4.7
I was looking in the wrong place; it looks like it made it into 4.7-rc4.
https://lkml.org/lkml/2016/6/20/5
torvalds/linux@30402c8
Agreed, going to close. Anyone coming across this issue after close please try upgrading to 4.7 first.
For those not wanting to track things down for ubuntu 16.04, it looks like the relevant changes have been backported into 4.4.0-35.54: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1607404
Anyone facing this issue on macOS Sierra (version 10.12.1) while using Docker Version 1.13.0 (15072)?
I faced this issue in CentOS 7
Docker version 1.13.1, build 092cba3
@analytically This is because the hard link to a domain socket file can not work in overlay fs.
Supervisor/supervisor#1067
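For anyone who wants to check whether a given directory is affected, the hard-link case described above can be exercised with a short script. This is a minimal sketch, not the reproducer used earlier in this thread: the function names try_connect and check_socket_hardlink and the file names orig.sock and link.sock are made up for illustration. It binds a listening socket, hard-links the socket file, and tries to connect through both paths.

```python
import os
import socket
import tempfile


def try_connect(path):
    """Return True if a client can connect to the Unix socket at `path`."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
        return True
    except OSError:
        return False
    finally:
        client.close()


def check_socket_hardlink(directory):
    """Bind a socket in `directory`, hard-link it, and test both paths.

    Returns a (direct_ok, link_ok) tuple of booleans.
    """
    original = os.path.join(directory, "orig.sock")  # hypothetical file name
    linked = os.path.join(directory, "link.sock")    # hypothetical file name
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(original)
        server.listen(5)
        # Create a second directory entry for the same socket file.
        os.link(original, linked)
        return try_connect(original), try_connect(linked)
    finally:
        server.close()
        for path in (original, linked):
            if os.path.exists(path):
                os.unlink(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        direct_ok, link_ok = check_socket_hardlink(tmp)
        print("direct:", direct_ok, "hard link:", link_ok)
```

Passing a directory on the container's root filesystem versus one on a volume or on /dev/shm should show whether the hard-linked path is the one that fails on a particular kernel.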
Issue seen on 3.10.0-327.36.3.el7.x86_64. Has this been backported to any 3.10 series kernel?
@srinivassurishetty 3.10.0-327 is quite old. Please make sure to update the kernel to the latest for the distro.
@srinivassurishetty that looks like RHEL numbering. If that's the case, then it is old even by RHEL standards (something like RHEL 7.2). This bug has been fixed in more recent RHEL. Make sure you use an up-to-date RHEL and an xfs volume that has ftype=1 (cf. the output of xfs_info <device>), especially if you are not using a dedicated volume for /var/lib/docker/overlay, as RHEL historically had ftype=0 for the root device.
Thank you @cpuguy83 @EricMountain-1A for the comments.
Found the root cause.
@EricMountain-1A as you pointed out, the problem is with overlayfs; the same kernel works with devicemapper.
By default, Ansible creates its SSH control sockets under the ~/.ansible/cp directory.
Socket creation does not work well with overlayfs (a Docker storage driver limitation, see the link below), so I changed the control socket directory path, and it now works fine across all storage drivers and kernels.
https://docs.docker.com/storage/storagedriver/overlayfs-driver/
Docker also recommends not using the overlay2 storage driver with kernel versions older than 3.10.0-514.
[ssh_connection]
control_path_dir=/vol/