Scope probe logs errors trying to write to app on bridge interface AKA stop scope registering on all interfaces

Question

argibbs opened this issue 5 years ago

Feature request

I'd like the ability to restrict the interfaces / IP addresses against which the scope app registers scope.weave.local in the weave DNS.

What's going wrong

The scope app is registering scope.weave.local against all available interfaces, including a network bridge which is inaccessible to other machines.

This causes errors to be printed once every 10 seconds in the probe logs, like so:

<probe> ERRO: 2019/10/01 11:40:10.500531 Error fetching app details: Get http://172.22.0.1:4040/api: dial tcp 172.22.0.1:4040: i/o timeout
<probe> ERRO: 2019/10/01 11:40:20.520891 Error fetching app details: Get http://172.22.0.1:4040/api: dial tcp 172.22.0.1:4040: i/o timeout
<probe> ERRO: 2019/10/01 11:40:30.511699 Error fetching app details: Get http://172.22.0.1:4040/api: dial tcp 172.22.0.1:4040: i/o timeout
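
To see which app endpoints the probe is failing on without scrolling through repeated lines, the log can be reduced to its distinct addresses. This is only a sketch: `failing_app_endpoints` is a made-up helper name, and it assumes the log lines keep the `dial tcp ADDRESS:PORT:` shape shown above.

```shell
# Hypothetical helper: read probe log lines on stdin and print each
# distinct ADDRESS:PORT the probe failed to reach.
failing_app_endpoints() {
  grep 'Error fetching app details' \
    | sed -n 's/.*dial tcp \([0-9.]*:[0-9]*\):.*/\1/p' \
    | sort -u
}

# Usage (test_scope_probe is the container name from the reproduction steps):
#   docker logs test_scope_probe 2>&1 | failing_app_endpoints
```

For the three log lines above this collapses to a single endpoint, which makes it easier to match against the interface list further down.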

Reproducing

Machine setup

Let's say I have two machines, host1 and host2. Docker is running on both of them. I'm not using k8s, or anything complicated - this is just a simple case of two servers with separate docker instances.

You'll need to have a network interface on one of the machines (not a docker interface) which is inaccessible to the other machine.

Scope setup

I have weave.net successfully set up on both servers, so that if I run a container on host1, I can successfully ping it from another container on host2, via the weave network.

So far, so good.

Now I spin up scope. To do this I run 3 containers (since this makes the problem clear).

First I spin up the scope app on host1

host1$ docker run --rm --net=host --label "works.weave.role=system" -v /var/run/docker.sock:/var/run/docker.sock:rw --name test_scope weaveworks/scope:1.11.6 --app.basicAuth --app.basicAuth.username=mario --app.basicAuth.password=wario --app.container.name=test_scope --app-only

Then I spin up the scope probe on host1

host1$ docker run --rm --net=host --pid=host --privileged=true --label "works.weave.role=system" -v /var/run/docker.sock:/var/run/docker.sock:rw --name test_scope_probe weaveworks/scope:1.11.6 --probe.docker=true --probe.basicAuth --probe.basicAuth.username=mario --probe.basicAuth.password=wario --probe-only

Lastly I spin up the scope probe on host2 (note this is the identical command to the previous step, just on a different host).

host2$ docker run --rm --net=host --pid=host --privileged=true --label "works.weave.role=system" -v /var/run/docker.sock:/var/run/docker.sock:rw --name test_scope_probe weaveworks/scope:1.11.6 --probe.docker=true --probe.basicAuth --probe.basicAuth.username=mario --probe.basicAuth.password=wario --probe-only

The probe instance on host1 starts up and sits there doing its thing. The probe instance on host2 starts up and also does its thing. I can successfully see stats from both probes in the app.

However, I can see regular errors in the logs for the probe instance on host2.

Network setup

Running ifconfig on host1 shows the following:

host1$ ifconfig
br-a3e1909c83e5: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.22.0.1  netmask 255.255.0.0  broadcast 172.22.255.255
        ether 02:42:ae:76:02:fb  txqueuelen 0  (Ethernet)

docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:dd:34:54:20  txqueuelen 0  (Ethernet)

docker_gwbridge: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.20.0.1  netmask 255.255.0.0  broadcast 172.20.255.255
        ether 02:42:2f:84:af:f8  txqueuelen 0  (Ethernet)

ens192: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.20.192.113  netmask 255.255.255.0  broadcast 10.20.192.255
        ether 00:50:56:a8:de:1f  txqueuelen 1000  (Ethernet)

weave: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1376
        inet 10.64.0.8  netmask 255.240.0.0  broadcast 10.79.255.255
        ether f2:65:21:0d:4b:f4  txqueuelen 1000  (Ethernet)

Note that br-a3e1909c83e5 is inaccessible from host2.

Weave DNS entry

Once the app and probes are running, I can launch a simple Docker container attached to the weave network on host2 and inspect the DNS entry for scope.weave.local

host2$ sudo docker run -it --rm --network=weave --hostname=foo.weave.local centos
[root@foo /]# yum install -y bind-utils
[root@foo /]# nslookup scope.weave.local
Server:         127.0.0.11
Address:        127.0.0.11#53

Name:   scope.weave.local
Address: 172.22.0.1
Name:   scope.weave.local
Address: 10.64.0.8
Name:   scope.weave.local
Address: 10.20.192.113

But this lookup runs on host2, and the IP address 172.22.0.1 belongs to the bridge interface on host1. It will never be reachable from host2. Hence the probe instance on host2 prints an error on each publish loop.
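
The mismatch between what is registered and what is reachable can be checked from the same container on host2. The sketch below rests on a few assumptions: `resolved_addresses` and `check_app_endpoints` are hypothetical helper names, the app listens on port 4040 (as in the error logs), and `bash` and `timeout` are available in the container.

```shell
# Hypothetical helper: read nslookup output on stdin and print only the
# answer addresses, skipping the "Address:" line of the DNS server itself.
resolved_addresses() {
  awk '/^Name:/ {seen=1; next} seen && /^Address:/ {print $2}'
}

# Hypothetical helper: read one address per line and try a TCP connection
# to port 4040, giving up after 3 seconds per address.
check_app_endpoints() {
  while read -r addr; do
    if timeout 3 bash -c "exec 3<>/dev/tcp/$addr/4040" 2>/dev/null; then
      echo "$addr reachable"
    else
      echo "$addr unreachable"
    fi
  done
}

# Usage, from the container on host2:
#   nslookup scope.weave.local | resolved_addresses | check_app_endpoints
```

Given the description above, 172.22.0.1 would be expected to show up as unreachable, while the weave and ens192 addresses should connect.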

Versions:

$ scope version
1.11.6 (as per docker command above)
$ docker version
Docker version 18.09.3, build 774a1f4
$ uname -a
Linux host1 3.10.0-693.17.1.el7.x86_64 #1 SMP Thu Jan 25 20:13:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Andrew Gibbs · Answer 1 · Tue Oct 01 2019 20:55:04 GMT+0800 (China Standard Time)

Ahhhh, apologies, I just realised I can use --weave.hostname=host1 as an argument to the probe instance to prevent the issue. Closing.
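
For anyone landing here with the same errors, the workaround might look roughly like the following. It is the probe command from the report with `--weave.hostname` added; `start_probe` is a hypothetical wrapper name, the position of the flag among the other arguments is an assumption, and the hostname passed in has to be one that resolves to a reachable address in your setup.

```shell
# Sketch only: the probe command from the report plus --weave.hostname.
# start_probe is a hypothetical wrapper; $1 is the hostname to use.
start_probe() {
  weave_hostname="$1"
  docker run --rm --net=host --pid=host --privileged=true \
    --label "works.weave.role=system" \
    -v /var/run/docker.sock:/var/run/docker.sock:rw \
    --name test_scope_probe weaveworks/scope:1.11.6 \
    --probe.docker=true \
    --probe.basicAuth --probe.basicAuth.username=mario --probe.basicAuth.password=wario \
    --weave.hostname="$weave_hostname" \
    --probe-only
}

# Usage, matching the comment above:
#   start_probe host1
```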