robbertkl / docker-ipv6nat

Extend Docker with IPv6 NAT, similar to IPv4

Exits with "unable to detect hairpin mode (is the docker daemon running?)"

Rycieos opened this issue

Version: v0.4.3
Docker version: 20.10.1 and 20.10.2
OS: CentOS Linux release 8.3.2011 (Core)

After a system update, upon launching I get this error:

$ docker logs ipv6nat
2021/01/09 17:26:57 unable to detect hairpin mode (is the docker daemon running?)

After which the container exits and restarts.

Thinking it might be a permissions issue, I removed all --cap-add flags, leaving only --cap-drop ALL to test, but that broke it further:

2021/01/09 18:07:38 running [/sbin/iptables -t nat -C OUTPUT -m addrtype --dst-type LOCAL -j DOCKER --wait]: exit status 3: addrtype: Could not determine whether revision 1 is supported, assuming it is.
addrtype: Could not determine whether revision 1 is supported, assuming it is.
iptables v1.8.4 (legacy): can't initialize iptables table `nat': Permission denied (you must be root)
Perhaps iptables or your kernel needs to be upgraded.

I then tried to give it --cap-add ALL, but that did not fix it.

Since part of the system update was docker-ce, I thought maybe it had changed the backend rules, but:

# /sbin/iptables-save -t nat
# Generated by iptables-save v1.8.4 on Sat Jan  9 13:09:03 2021
*nat
...
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
...

Clearly the right rule still exists. And checking manually:

# /sbin/iptables -t nat -C OUTPUT -m addrtype --dst-type LOCAL -j DOCKER --wait; echo "$?"
iptables: Bad rule (does a matching rule exist in that chain?).
1
# /sbin/iptables -t nat -C OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER --wait; echo "$?"
0

The actual check commands return the expected results. I am using this code section as the reference: https://github.com/robbertkl/docker-ipv6nat/blob/v0.4.3/manager.go#L79-L86
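
For context, a minimal Go sketch of what that check conceptually does, based only on the -C probes shown above and in the log (an illustration, not the actual manager.go code, which may compare rule text rather than shelling out to -C):

    // Simplified sketch of the hairpin-mode detection: probe Docker's NAT
    // OUTPUT rule in its hairpin form (no loopback exclusion) and in its
    // default form, and fail with the reported error if neither exists.
    package main

    import (
        "fmt"
        "log"
        "os/exec"
    )

    // ruleExists runs `iptables -t nat -C OUTPUT <spec> --wait`; iptables
    // exits 0 when the rule exists and 1 when it does not.
    func ruleExists(spec ...string) (bool, error) {
        args := append([]string{"-t", "nat", "-C", "OUTPUT"}, spec...)
        args = append(args, "--wait")
        err := exec.Command("iptables", args...).Run()
        if err == nil {
            return true, nil
        }
        if ee, ok := err.(*exec.ExitError); ok && ee.ExitCode() == 1 {
            return false, nil
        }
        return false, err // exit codes > 1 indicate a real iptables error
    }

    func detectHairpinMode() (bool, error) {
        docker := []string{"-m", "addrtype", "--dst-type", "LOCAL", "-j", "DOCKER"}
        if on, err := ruleExists(docker...); err != nil {
            return false, err
        } else if on {
            return true, nil
        }
        off, err := ruleExists(append([]string{"!", "-d", "127.0.0.0/8"}, docker...)...)
        if err != nil {
            return false, err
        }
        if off {
            return false, nil
        }
        return false, fmt.Errorf("unable to detect hairpin mode (is the docker daemon running?)")
    }

    func main() {
        hairpin, err := detectHairpinMode()
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println("hairpin mode:", hairpin)
    }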

At this point I downgraded dockerd back to 20.10.1, but I got the same error.

What is strange is that when I first did the system upgrade, dockerd restarted itself as usual, and all my containers came back online with IPv6 working. It was after an OS restart that this error started.

I tried to do a system rollback, but the old package versions couldn't be found, so I'm stuck.

Full package list that I upgraded:

Package New Version Old Version
NetworkManager 1:1.26.0-12.el8_3.x86_64 1:1.26.0-9.el8_3.x86_64
NetworkManager-libnm 1:1.26.0-12.el8_3.x86_64 1:1.26.0-9.el8_3.x86_64
NetworkManager-team 1:1.26.0-12.el8_3.x86_64 1:1.26.0-9.el8_3.x86_64
NetworkManager-tui 1:1.26.0-12.el8_3.x86_64 1:1.26.0-9.el8_3.x86_64
gnutls 3.6.14-7.el8_3.x86_64 3.6.14-6.el8.x86_64
iptables 1.8.4-15.el8_3.3.x86_64 1.8.4-15.el8.x86_64
iptables-ebtables 1.8.4-15.el8_3.3.x86_64 1.8.4-15.el8.x86_64
iptables-libs 1.8.4-15.el8_3.3.x86_64 1.8.4-15.el8.x86_64
iptables-services 1.8.4-15.el8_3.3.x86_64 1.8.4-15.el8.x86_64
iwl100-firmware 39.31.5.1-101.el8_3.1.noarch 39.31.5.1-99.el8.1.noarch
iwl1000-firmware 1:39.31.5.1-101.el8_3.1.noarch 1:39.31.5.1-99.el8.1.noarch
iwl105-firmware 18.168.6.1-101.el8_3.1.noarch 18.168.6.1-99.el8.1.noarch
iwl135-firmware 18.168.6.1-101.el8_3.1.noarch 18.168.6.1-99.el8.1.noarch
iwl2000-firmware 18.168.6.1-101.el8_3.1.noarch 18.168.6.1-99.el8.1.noarch
iwl2030-firmware 18.168.6.1-101.el8_3.1.noarch 18.168.6.1-99.el8.1.noarch
iwl3160-firmware 1:25.30.13.0-101.el8_3.1.noarch 1:25.30.13.0-99.el8.1.noarch
iwl3945-firmware 15.32.2.9-101.el8_3.1.noarch 15.32.2.9-99.el8.1.noarch
iwl4965-firmware 228.61.2.24-101.el8_3.1.noarch 228.61.2.24-99.el8.1.noarch
iwl5000-firmware 8.83.5.1_1-101.el8_3.1.noarch 8.83.5.1_1-99.el8.1.noarch
iwl5150-firmware 8.24.2.2-101.el8_3.1.noarch 8.24.2.2-99.el8.1.noarch
iwl6000-firmware 9.221.4.1-101.el8_3.1.noarch 9.221.4.1-99.el8.1.noarch
iwl6000g2a-firmware 18.168.6.1-101.el8_3.1.noarch 18.168.6.1-99.el8.1.noarch
iwl6050-firmware 41.28.5.1-101.el8_3.1.noarch 41.28.5.1-99.el8.1.noarch
iwl7260-firmware 1:25.30.13.0-101.el8_3.1.noarch 1:25.30.13.0-99.el8.1.noarch
kexec-tools 2.0.20-34.el8_3.1.x86_64 2.0.20-34.el8.x86_64
linux-firmware 20200619-101.git3890db36.el8_3.noarch 20200619-99.git3890db36.el8.noarch
microcode_ctl 4:20200609-2.20201112.1.el8_3.x86_64 4:20200609-2.20201027.1.el8_3.x86_64
systemd 239-41.el8_3.1.x86_64 239-41.el8_3.x86_64
systemd-libs 239-41.el8_3.1.x86_64 239-41.el8_3.x86_64
systemd-pam 239-41.el8_3.1.x86_64 239-41.el8_3.x86_64
systemd-udev 239-41.el8_3.1.x86_64 239-41.el8_3.x86_64
tuned 2.14.0-3.el8_3.1.noarch 2.14.0-3.el8.noarch
tzdata 2020f-1.el8.noarch 2020d-1.el8.noarch
docker-ce 3:20.10.2-3.el8.x86_64 3:20.10.1-3.el8.x86_64
docker-ce-cli 1:20.10.2-3.el8.x86_64 1:20.10.1-3.el8.x86_64
docker-ce-rootless-extras 20.10.2-3.el8.x86_64 20.10.1-3.el8.x86_64

Seems like coreos/go-iptables/issues/79 could be related.

Hi @Rycieos, thanks for the extensive report!

Could you try a few things for me?

  • Do you use --network host? And could you try with --privileged instead of the --cap flags?
  • Could you try to use robbertkl/ipv6nat:0.4.2 instead of latest? I've recently upgraded go-iptables, so this could indeed be related to the issue you mentioned.

I am running with --network host, yes.

Trying --privileged instead has the same result.

I forgot to mention: I think I was running v0.4.1 before debugging, and updated to see if it resolved my issue. I just tested versions v0.4.1, v0.4.2, and v0.4.3 with both --privileged and the --cap-add flags; same results.

Trying --privileged instead has the same result

Just to be sure: in that case you're running the ipv6nat container with --network host AND --privileged?

And before the system upgrade, it was all working fine?

v0.4.1 and v0.4.2 have been around for quite a while already, but I haven't seen an issue like this before. And if the issue started only after the system upgrade, that only makes it stranger, and it doesn't seem related to go-iptables. I've also upgraded to newer versions of Alpine and Go for the v0.4.3 container, but that would be unrelated as well, since you have the same issue with v0.4.1 and v0.4.2.

Your system's iptables upgrade goes from 1.8.4-15.el8.x86_64 to 1.8.4-15.el8_3.3.x86_64, but I can't find what changed between those. Also, you mentioned that running the check yourself works just fine.

A few more things to try:

  • Can you try to list the iptables rules (iptables-save -t nat) and run the check commands (-C) from within the docker-ipv6nat container image? Since you can't attach to the running container (it keeps crashing), you could start a temporary container with something like docker run --rm -it --privileged --network host --entrypoint sh robbertkl/ipv6nat and run the commands from there.

  • Can you try running the docker-ipv6nat binary directly on the host (so not in a docker container)?

Can you try to list the iptables rules (iptables-save -t nat) and run the check commands (-C) from within the docker-ipv6nat container image? Since you can't attach to the running container (it keeps crashing), you could start a temporary container with something like docker run --rm -it --entrypoint sh robbertkl/ipv6nat and run the commands from there.

Good idea!

Sanity check from host:

$ sudo iptables-save -t nat
# Generated by iptables-save v1.8.4 on Sat Jan  9 16:06:51 2021
*nat
...
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
...

Container:

$ docker run --rm -it --privileged --network host --entrypoint sh robbertkl/ipv6nat
/ # iptables-save -t nat
# Generated by iptables-save v1.8.4 on Sat Jan  9 21:06:41 2021
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
COMMIT
# Completed on Sat Jan  9 21:06:41 2021

Well that isn't good. Same thing if I do docker run --rm -it --cap-add NET_ADMIN --cap-add NET_RAW --network host --entrypoint sh robbertkl/ipv6nat instead.

Any ideas? I'll keep digging.

From within the container (docker run, again with --privileged --network host), could you run these 3 commands:

ls -l `which iptables`
xtables-legacy-multi iptables-save -t nat
xtables-nft-multi iptables-save -t nat

Sorry, forgot:

Before running the ls -l command, please run ./docker-ipv6nat-compat once.

$ docker run --rm -it --cap-add ALL --network host --entrypoint sh robbertkl/ipv6nat
/ # ls -l `which iptables`
lrwxrwxrwx    1 root     root            20 Dec 28 22:49 /sbin/iptables -> xtables-legacy-multi

/ # xtables-legacy-multi iptables-save -t nat
# Generated by iptables-save v1.8.4 on Sat Jan  9 21:17:25 2021
*nat
:PREROUTING ACCEPT [837:70639]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [138:9289]
:POSTROUTING ACCEPT [942:73933]
COMMIT
# Completed on Sat Jan  9 21:17:25 2021

/ # xtables-nft-multi iptables-save -t nat
# Generated by iptables-save v1.8.4 on Sat Jan  9 21:17:29 2021
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:DOCKER - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 172.17.0.0/32 ! -o docker0 -j MASQUERADE
-A POSTROUTING -s 10.1.0.0/32 ! -o br-dce386402b8e -j MASQUERADE
...
-A POSTROUTING -s 10.1.0.2/32 -d 10.1.0.2/32 -p tcp -m tcp --dport 443 -j MASQUERADE
-A POSTROUTING -s 10.1.0.2/32 -d 10.1.0.2/32 -p tcp -m tcp --dport 80 -j MASQUERADE
...
-A OUTPUT ! -d 127.0.0.0/32 -m addrtype --dst-type LOCAL -j DOCKER
-A DOCKER -i docker0 -j RETURN
-A DOCKER -i br-dce386402b8e -j RETURN
...
-A DOCKER ! -i br-dce386402b8e -p tcp -m tcp --dport 443 -j DNAT --to-destination 10.1.0.2:443
-A DOCKER ! -i br-dce386402b8e -p tcp -m tcp --dport 80 -j DNAT --to-destination 10.1.0.2:80
...
COMMIT
# Completed on Sat Jan  9 21:17:29 2021
# Warning: iptables-legacy tables present, use iptables-legacy-save to see them

That is identical to what I see on the host. Does that mean the iptables update I got switched from the legacy backend to a newer backend?

And before the system upgrade, it was all working fine?

Correct.

Oops, I missed this before:
Running on the host:

$ sudo iptables-save -t nat
# Generated by iptables-save v1.8.4 on Sat Jan  9 16:15:14 2021
...
# Completed on Sat Jan  9 16:15:14 2021
# Warning: iptables-legacy tables present, use iptables-legacy-save to see them
$ sudo iptables-legacy-save -t nat
sudo: iptables-legacy-save: command not found

That happens with iptables -L as well. I don't remember seeing that warning before, so something must have changed with my update.

Sorry, forgot:

Before running the ls -l command, please run ./docker-ipv6nat-compat once.

$ docker run --rm -it --cap-add NET_ADMIN --cap-add NET_RAW --network host --entrypoint sh robbertkl/ipv6nat
/ # ls -l `which iptables`
lrwxrwxrwx    1 root     root            20 Dec 28 22:49 /sbin/iptables -> xtables-legacy-multi

/ # ./docker-ipv6nat-compat
2021/01/09 21:25:59 unable to detect hairpin mode (is the docker daemon running?)

/ # ls -l `which iptables`
lrwxrwxrwx    1 root     root            17 Jan  9 21:25 /sbin/iptables -> xtables-nft-multi

Looks like that fixes the symlink to point to the right backend. I'm assuming the standard entrypoint must not be doing that, or it would be working for me.

It looks like your system has indeed switched backend, but docker-ipv6nat-compat should pick that up and symlink to the correct version.
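
For illustration only, the general backend-detection idea such an entrypoint can use looks roughly like the Go sketch below; the actual docker-ipv6nat-compat script may detect and switch differently, and the rule-counting heuristic here is an assumption:

    // Illustrative sketch: pick the xtables backend that reports more rules
    // and point the usual iptables names at that multi-binary via symlinks.
    package main

    import (
        "log"
        "os"
        "os/exec"
        "strings"
    )

    // countRules returns how many rule lines (-A ...) a given backend reports.
    func countRules(multiBinary string) int {
        out, err := exec.Command(multiBinary, "iptables-save").Output()
        if err != nil {
            return 0
        }
        count := 0
        for _, line := range strings.Split(string(out), "\n") {
            if strings.HasPrefix(line, "-A ") {
                count++
            }
        }
        return count
    }

    func main() {
        backend := "xtables-legacy-multi"
        if countRules("xtables-nft-multi") > countRules("xtables-legacy-multi") {
            backend = "xtables-nft-multi"
        }
        for _, name := range []string{"iptables", "iptables-save", "ip6tables", "ip6tables-save"} {
            link := "/sbin/" + name
            os.Remove(link) // best effort; ignore error if the link doesn't exist yet
            if err := os.Symlink(backend, link); err != nil {
                log.Fatal(err)
            }
        }
        log.Println("selected backend:", backend)
    }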

Could you try the previous ls -l command again, but this time after you run ./docker-ipv6nat-compat once?

Also: do you normally run the ipv6nat container with the entrypoint set? Or does it use the default entrypoint?

Also: do you normally run the ipv6nat container with the entrypoint set? Or does it use the default entrypoint?

Default. Here is my full docker-compose file:

version: '2.3'
services:
  ipv6nat:
    image: robbertkl/ipv6nat:0.4.3
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    network_mode: host
    cap_drop:
      - ALL
    cap_add:
      - NET_RAW
      - NET_ADMIN
      - SYS_MODULE

Hmm, so it does switch over to xtables-nft-multi (docker-ipv6nat-compat is actually the standard entrypoint), but still doesn't work? That's very strange.

One thing that looks off in your output:

  • xtables-legacy-multi iptables-save -t nat shows no rules, but it does have counters behind each chain (the numbers in brackets)
  • xtables-nft-multi iptables-save -t nat shows Docker's NAT rules, but the counters are all 0

I think I got it!

On host:

$ sudo iptables-save -t nat
# Generated by iptables-save v1.8.4 on Sat Jan  9 16:34:29 2021
*nat
...
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER

In container (I hacked the entrypoint to not exit after it crashes):

± docker exec -it ipv6nat_ipv6nat_1 sh
/ # ls -l `which iptables`
lrwxrwxrwx    1 root     root            17 Jan  9 21:32 /sbin/iptables -> xtables-nft-multi

/ # iptables-save -t nat
# Generated by iptables-save v1.8.4 on Sat Jan  9 21:33:13 2021
*nat
...
-A OUTPUT ! -d 127.0.0.0/32 -m addrtype --dst-type LOCAL -j DOCKER

It shows the address as 127.0.0.0/32, not 127.0.0.0/8!

Wow, that should definitely explain it!

Any idea why that would happen, and why only after a system upgrade? Very strange how iptables-save (both at version 1.8.4) shows different things on the host and in the container for the same rule!

Actually, I should stop truncating output. It shows that for every single rule!

-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE

turns into

-A POSTROUTING -s 172.17.0.0/32 ! -o docker0 -j MASQUERADE

Same thing is done on the filter table as well.

Agreed, very strange.

I'll try digging into why it warns about legacy tables; maybe there is a Red Hat bug somewhere about this that could provide some clues.

Not sure, but perhaps the legacy tables were created by the iptables commands within the container before running docker-ipv6nat-compat (by default the image is set to legacy). I would recommend a reboot to see if the warning goes away.

The counters are still a bit strange, however. Is Docker's NAT working OK for IPv4 after the upgrade? Can you reach the published ports?

Not sure, but perhaps the legacy tables were created by the iptables commands within the container before running docker-ipv6nat-compat (by default the image is set to legacy). I would recommend a reboot to see if the warning goes away.

Yeah, I'll try a reboot.

The counters are still a bit strange, however. Is Docker's NAT working OK for IPv4 after the upgrade? Can you reach the published ports?

Yup, IPv4 works just fine. The counters look normal from the host; it's just the chains' built-in policy counters that show 0:

± sudo iptables -t nat -L -v
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
 1567 92212 DOCKER     all  --  any    any     anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 MASQUERADE  all  --  any    !docker0  172.17.0.0/16        anywhere
10396  629K MASQUERADE  all  --  any    !br-dce386402b8e  10.1.0.0/16          anywhere
...
    0     0 MASQUERADE  tcp  --  any    any     10.1.0.2             10.1.0.2             tcp dpt:https
    0     0 MASQUERADE  tcp  --  any    any     10.1.0.2             10.1.0.2             tcp dpt:http
...

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    0     0 DOCKER     all  --  any    any     anywhere            !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain DOCKER (2 references)
 pkts bytes target     prot opt in     out     source               destination
    0     0 RETURN     all  --  docker0 any     anywhere             anywhere
    0     0 RETURN     all  --  br-dce386402b8e any     anywhere             anywhere
...
  174  9100 DNAT       tcp  --  !br-dce386402b8e any     anywhere             anywhere             tcp dpt:https to:10.1.0.2:443
   14   648 DNAT       tcp  --  !br-dce386402b8e any     anywhere             anywhere             tcp dpt:http to:10.1.0.2:80
...
# Warning: iptables-legacy tables present, use iptables-legacy to see them

The second reboot did clear up the warnings about legacy tables existing, but it did not fix the problem. iptables in the container still shows /32 for all IP ranges.

The second reboot did clear up the warnings about legacy tables existing.

But only until I exec'd into the container and ran iptables-legacy-save (symlinked as iptables-save, which I forgot about), which created legacy tables! So that is where the legacy rules are coming from, and why I never saw the warning before. I can't seem to find a way to remove these rules without rebooting.

Yeah, that's what I thought created the legacy rules. But if you're exec'ing into the container, the container would have already executed the docker-ipv6nat-compat script and symlinked iptables-save to xtables-nft-multi, right? Or do you mean a manually started container?

Or do you mean a manually started container?

Yeah, I just did that in a manually started container after rebooting without thinking.

I'm still stumped as to how an iptables patch update could cause this, especially since the version that is showing the wrong output is in the container and wasn't changed.

It could still be that the update switched the backend from legacy to nft and something is not working properly in the "translation" to iptables output, which iptables-nft does. I can't confirm on my current system, as it's using legacy; I'll have to set up a new machine to test it.

Could you try installing iptables 1.8.6 from Alpine "edge" within the container:

  • Either exec into your ipv6nat container or run a new container and manually run docker-ipv6nat-compat, so the correct symlinks are made
  • Run this to upgrade iptables: apk upgrade iptables ip6tables --no-cache --repository=http://dl-cdn.alpinelinux.org/alpine/edge/main
  • Check the iptables-save output again

No dice:

± docker run --rm -it --cap-add NET_ADMIN --cap-add NET_RAW --network host --entrypoint sh robbertkl/ipv6nat

/ # apk upgrade iptables ip6tables --no-cache --repository=http://dl-cdn.alpinelinux.org/alpine/edge/main
fetch http://dl-cdn.alpinelinux.org/alpine/edge/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/community/x86_64/APKINDEX.tar.gz
Upgrading critical system libraries and apk-tools:
(1/2) Upgrading musl (1.1.24-r10 -> 1.2.2_pre7-r0)
(2/2) Upgrading apk-tools (2.10.5-r1 -> 2.12.0-r4)
Executing busybox-1.31.1-r19.trigger
Continuing the upgrade transaction with new apk-tools:
fetch http://dl-cdn.alpinelinux.org/alpine/edge/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/community/x86_64/APKINDEX.tar.gz
(1/2) Upgrading iptables (1.8.4-r2 -> 1.8.6-r0)
(2/2) Upgrading ip6tables (1.8.4-r2 -> 1.8.6-r0)
Executing busybox-1.31.1-r19.trigger
OK: 8 MiB in 18 packages

/ # ./docker-ipv6nat-compat
2021/01/09 22:55:45 unable to detect hairpin mode (is the docker daemon running?)

/ # iptables-save -V
iptables-save v1.8.6 (nf_tables)

/ # iptables-save -t nat
# Generated by iptables-save v1.8.6 on Sat Jan  9 22:55:53 2021
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:DOCKER - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 172.17.0.0/32 ! -o docker0 -j MASQUERADE
...
-A POSTROUTING -s 10.1.0.7/32 -d 10.1.0.7/32 -p tcp -m tcp --dport 443 -j MASQUERADE
-A POSTROUTING -s 10.1.0.7/32 -d 10.1.0.7/32 -p tcp -m tcp --dport 80 -j MASQUERADE
-A OUTPUT ! -d 127.0.0.0/32 -m addrtype --dst-type LOCAL -j DOCKER

I was just able to track down the rpm files of the previous version of iptables I had installed. I'm going to try a manual roll back and see if it fixes my problem.

Great, let me know. Still blows my mind how it can affect only the result within the container.

Also wondering how iptables on the host could have an effect in the first place: I don't think it's even used? Your system uses nftables and I think it's only iptables-nft in the container that talks directly to nftables. I don't think it's even using the iptables installed on the host.

Well that fixed it. 🤕

Feel free to close this issue if you want, since it seems to be a problem with an external package. Though if you think it is a compatibility issue, I would be happy to help you continue to debug it (though not on my home prod system).

Full fix detailed below, in case anyone else hits the exact same stuck-package scenario:

# Get all "old" packages
$ wget http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/iptables-1.8.4-15.el8.x86_64.rpm
$ wget http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/iptables-ebtables-1.8.4-15.el8.x86_64.rpm
$ wget http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/iptables-libs-1.8.4-15.el8.x86_64.rpm
$ wget http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/iptables-services-1.8.4-15.el8.x86_64.rpm

$ sudo yum downgrade ./iptables-*

# Destroy the container, just in case any loaded kernel modules stick around
$ docker rm -f ipv6nat

# Rebooting was the only thing that fixed it for me
$ sudo reboot

# Recreate the container
$ docker run -d --name ipv6nat --cap-drop ALL --cap-add NET_ADMIN --cap-add NET_RAW --network host --restart unless-stopped -v /var/run/docker.sock:/var/run/docker.sock:ro robbertkl/ipv6nat
# or whatever your command is

Thanks for all your help tracking down what was causing the problem!

Your system uses nftables and I think it's only iptables-nft in the container that talks directly to nftables. I don't think it's even using the iptables installed on the host.

You are right, I think. I guess somehow the new package version has some bug that interacts with the kernel incorrectly, and then saves (and later prints) the rules incorrectly? Yeah, it doesn't make sense to me either. I wouldn't even know how to go about reporting this as a bug to the package maintainers; I guess I would need to prove that the rule actually got saved wrong somehow.

Also wondering how iptables on the host could have an effect in the first place: I don't think it's even used?

Makes me wonder if I could have mounted the host xtables-nft-multi binary in the container to fix it. Probably only if it was statically linked, since the container runs on Alpine (MUSL based IIRC).

Well that fixed it. 🤕

Wow, that was quite the journey. Great you figured it out! And thanks for the detailed fix.

Let's leave it at this. I'll keep an eye out for more reports of this issue.

Makes me wonder if I could have mounted the host xtables-nft-multi binary in the container to fix it. Probably only if it was statically linked, since the container runs on Alpine (MUSL based IIRC).

Yeah, that's usually a no-go, for that exact reason.

I can confirm that this error occurs with a fresh install of CentOS.
The downgrade solution worked for me.

Thanks @thedejavunl, best to keep the issue open then.

I spoke to Phil Sutter from Red Hat, who did both the upstream patch and its backport into RHEL 8.3.

The commit in question is here. To quote Phil:

Sadly it is not part of an official release yet, ETA is v1.8.7.

About the issue we're seeing in the Docker container:

Basically it's a problem with data representation inside the container. The iptables binary in there doesn't respect the reduced payload expression length and due to absence of the (not needed) bitwise expression assumes the full address is being matched.

So aside from the workaround (downgrading as detailed here) I guess the only solution would be to either wait for 1.8.7 (and its Alpine edge packages) or build a patched version and ship that in the container image.

Wow @robbertkl, fantastic detective work!

So, to be 100% clear: this "backport" in the Red Hat packages happened between versions 1.8.4-15.el8 and 1.8.4-15.el8_3.3?

And this change is expected to land in iptables v1.8.7?

As running two different versions of iptables against the same kernel probably wasn't ever intended, I can understand why this could happen.

As for my own environment, I will just freeze my iptables version to 1.8.4-15.el8 until v1.8.7 is released and updated here.

So, to be 100% clear: this "backport" in the Red Hat packages happened between versions 1.8.4-15.el8 and 1.8.4-15.el8_3.3?

Correct, see the changelog here: https://centos.pkgs.org/8/centos-baseos-x86_64/iptables-services-1.8.4-15.el8_3.3.x86_64.rpm.html

Thank you very much for the explanation @robbertkl. 🥇

Hey,

does anybody know if there is a Red Hat bug tracker record for this?

As running two different versions of iptables against the same kernel probably wasn't ever intended, I can understand why this could happen.

If this is the issue, couldn't it be fixed by upgrading the ipv6nat Docker container to use a newer version of iptables? Maybe as an opt-in (e.g. a new tag)?

The new iptables package is only in Alpine edge, which is not intended for production use. The iptables upgrade contains no security fixes, only bug fixes, so if the firewall is working correctly there is no need to update the RPM packages.

If this is the issue, couldn't it be fixed by upgrading the ipv6nat Docker container to use a newer version of iptables? Maybe as an opt-in (e.g. a new tag)?

That seems to be the plan, once iptables is updated in Alpine. We all seem to agree that we should wait until it is "stable" before doing that.

Hi all,

With Docker 20.10.6, the ipv6nat functionality is fully integrated (experimental).
You can add the following options to your daemon.json:

{
  "ipv6": true,
  "fixed-cidr-v6": "fd00::/80",
  "experimental": true,
  "ip6tables": true
}

Heya,

as we still run into this issue, I did some research and also spoke to Phil about it a bit to understand it.

The fix/backport implemented a more optimized way to store rules in the kernel. The issue is the following: if the host's iptables stores rules this way but the container's iptables doesn't understand it, the output is messed up. It looks like this:

host # iptables-save | grep "LOCAL -j DOCKER"
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER

container # xtables-nft-multi iptables-save | grep "LOCAL -j DOCKER"
-A OUTPUT ! -d 127.0.0.0/32 -m addrtype --dst-type LOCAL -j DOCKER

So the rule is displayed differently, but it is "correct" inside the kernel.

Since this is not easy to solve (the versions outside and inside the container would have to be "the same"), may I suggest the following? The check that actually fails is part of manager.go:

func detectHairpinMode() (bool, error) {

If the check were changed to accept both forms, /8 and /32, the problem should be "gone".
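
A rough Go sketch of the idea (an untested illustration only; the exact rule handling in manager.go may differ): parse the OUTPUT chain with iptables -S and accept either rendering of the loopback exclusion.

    // Tolerant hairpin-mode detection: match Docker's OUTPUT rule whether
    // the loopback exclusion is shown as 127.0.0.0/8 or as 127.0.0.0/32.
    package main

    import (
        "fmt"
        "os/exec"
        "strings"
    )

    func detectHairpinMode() (bool, error) {
        out, err := exec.Command("iptables", "--wait", "-t", "nat", "-S", "OUTPUT").Output()
        if err != nil {
            return false, err
        }
        for _, line := range strings.Split(string(out), "\n") {
            if !strings.Contains(line, "-m addrtype --dst-type LOCAL -j DOCKER") {
                continue
            }
            // Hairpin mode off: the rule excludes loopback, in either rendering.
            if strings.Contains(line, "! -d 127.0.0.0/8") || strings.Contains(line, "! -d 127.0.0.0/32") {
                return false, nil
            }
            return true, nil // no loopback exclusion: hairpin mode is on
        }
        return false, fmt.Errorf("unable to detect hairpin mode (is the docker daemon running?)")
    }

    func main() {
        hairpin, err := detectHairpinMode()
        fmt.Println("hairpin mode:", hairpin, "error:", err)
    }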

Anything I missed? Would it be worth a try? I'm not a Go coder, so I have no clue how to do it myself, but I expect it to be "easy" to fix, at least much easier than getting both versions in sync.

Cheers,
Sven

Just pushed out a new release v0.4.4 which contains the fix for this issue! Docker images for all architectures are on Docker Hub as :0.4.4 and :latest. Thanks everyone!

@robbertkl I still get "unable to detect hairpin mode (is the docker daemon running?)" with 0.4.4 on Synology DSM 6.2.4-25556 with the Synology-current Docker version 20.10.3-0554. I use Option B from the README.md. IPv6 works in general on the system.