Support for --internal network?
dkoshkin opened this issue · comments
Just found this project and I'm super excited - it works great for the default and other non---internal Docker networks.
$ docker run --rm --name nginx -d nginx
$ docker inspect nginx --format '{{.NetworkSettings.IPAddress}}'
172.17.0.2
$ curl -m 1 -I 172.17.0.2
HTTP/1.1 200 OK
Server: nginx/1.23.3
Date: Wed, 08 Mar 2023 03:44:47 GMT
Content-Type: text/html
Content-Length: 615
Last-Modified: Tue, 13 Dec 2022 15:53:53 GMT
Connection: keep-alive
ETag: "6398a011-267"
Accept-Ranges: bytes
But it doesn't work if I use an --internal network:
$ docker network create --internal internal
$ docker run --rm --name nginx-internal --network internal -d nginx
# note that NetworkSettings.IPAddress is empty
$ docker inspect nginx-internal --format '{{.NetworkSettings.Networks.internal.IPAddress}}'
$ curl -m 1 -I 172.28.0.2
curl: (28) Connection timed out after 1002 milliseconds
Was wondering if it's even possible for this to work with an internal network?
Hey @dkoshkin, thanks for reaching out. Without --internal, Docker creates a bridge network between the container and the host (Linux VM) to give it external access. Since adding --internal removes this bridge (not entirely true - see below), you no longer get access to the Wireguard tunnel that lives on the Linux VM.
Curious what your use case is for using --internal with this tool?
We use it to simulate air-gapped (specifically no egress) networks in docker.
Got it. Yeah this is a bit of an interesting case, as typically air-gap would exclude all networks, including your host (but of course, that makes dev/debugging hard).
The solution here would be to somehow create a connection only between the --internal container and the macOS host, without passing through the Linux host's network namespace. I'll have to think more about this one.
Okay I did a deep dive on this today. This is what I discovered (for those curious):
How internal works
- I was partially wrong about the bridge network above. For internal networks, a bridge network is created. This makes sense of course - how else would multiple containers talk to each other within the internal network?
- The difference between an internal bridge and a regular bridge actually just comes down to a few iptables rules:
  - If internal, 2 additional iptables rules are added to the Linux host:
    iptables -I DOCKER-ISOLATION-STAGE-1 ! -d 172.20.0.0/16 -i br-f653b152ce48 -j DROP
    iptables -I DOCKER-ISOLATION-STAGE-1 ! -s 172.20.0.0/16 -o br-f653b152ce48 -j DROP
    (This uses br-f653b152ce48 as the example internal bridge interface and 172.20.0.0/16 as the example subnet.)
  - Basically Docker is saying, "If any packet from the internal subnet is going anywhere other than the same network, drop it"
  - A normal bridge also gets a few extra NAT rules (absent on an internal bridge) that masquerade traffic heading to an external network - not relevant here
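To make the rule pair above easier to play with, here's a tiny sketch (the helper name is mine; the bridge name and subnet are just the example values from this thread) that prints the two DROP rules Docker would install for a given internal network:

```shell
#!/bin/sh
# Hypothetical helper: print the two isolation rules Docker adds for an
# internal network, given its bridge interface and subnet.
internal_isolation_rules() {
  bridge="$1"   # e.g. br-f653b152ce48
  subnet="$2"   # e.g. 172.20.0.0/16
  # Drop anything entering the bridge that isn't destined for the subnet,
  # and anything leaving the bridge that didn't originate from it.
  echo "iptables -I DOCKER-ISOLATION-STAGE-1 ! -d $subnet -i $bridge -j DROP"
  echo "iptables -I DOCKER-ISOLATION-STAGE-1 ! -s $subnet -o $bridge -j DROP"
}

internal_isolation_rules br-f653b152ce48 172.20.0.0/16
```

You'd find the real bridge name and subnet for your network from `docker network inspect internal`.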
Solution
We need to respect the above isolation rules for internal networks while still giving access to the Wireguard VPN. I think the solution is to add 2 iptables rules to the Linux host:
iptables -I DOCKER-USER -o chip0 -j ACCEPT
iptables -I DOCKER-USER -i chip0 -j ACCEPT
Basically anytime a packet is headed to or from the chip0 (Wireguard) interface, accept it.
Fyi, DOCKER-USER is a custom iptables chain created by Docker, intended to be augmented by end users. See:
https://docs.docker.com/network/iptables/
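For anyone wanting to try this by hand before it ships as a default, here's a sketch of applying the two ACCEPT rules idempotently (the function name and APPLY flag are mine; by default it only prints the commands, and with APPLY=1 it runs them, which requires root inside the Linux VM):

```shell
#!/bin/sh
# Sketch: insert the DOCKER-USER accept rules for the WireGuard interface
# (chip0, per the discussion above). Prints each command; set APPLY=1 to
# actually execute it on the Linux host.
add_accept_rules() {
  iface="$1"
  for dir in -i -o; do
    echo iptables -I DOCKER-USER "$dir" "$iface" -j ACCEPT
    if [ "$APPLY" = 1 ]; then
      # `iptables -C` checks whether the rule already exists, so re-running
      # this script doesn't stack duplicate rules.
      iptables -C DOCKER-USER "$dir" "$iface" -j ACCEPT 2>/dev/null ||
        iptables -I DOCKER-USER "$dir" "$iface" -j ACCEPT
    fi
  done
}

add_accept_rules chip0
```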
I've tested the above and it works. I think it is safe to add these rules as defaults (rather than some sort of opt-in), because this is more-or-less the default behaviour if you were to run Docker on a vanilla Linux host.