worstcase / blockade

Docker-based utility for testing network failures and partitions in distributed applications

Home Page: http://blockade.readthedocs.org

Blockade doesn't work with native driver

aidanhs opened this issue

This is just an umbrella issue. I've found a way to make this work, but I don't know how you want me to make PRs - in small chunks or all at once?

Hey thanks for the issue! I haven't had a chance to try with the native driver. PRs are much appreciated. If your fixes easily break down into smaller chunks, that is of course easier to review. Otherwise a single PR is fine, it might just take me a little longer to review and merge.

Bringing this over to the open issue - my POC uses nsenter to dig into container interface peer_ifindex values and find their pairs.

Do you have a better way of doing this?

Hmm, your approach sounds interesting. I'm not sure if my approach is better, or even very portable. I've found that, at least on Ubuntu Trusty, you can link the container process's net ns into /var/run/netns and then use ip netns exec to run the needed commands in that netns. That way you don't even need to find the interface name. So for example:

# ln -sf /proc/22196/ns/net /var/run/netns/XXX   # 22196 is the container PID dug out of docker inspect
# ip netns exec XXX tc qdisc show dev eth0
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

So I was thinking we could create this link for each container when we start it, using something like blockade_id-container_id for the name. However I'm not at all sure how portable this will be. I'd love to see your POC, if you want to push a branch up.
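A minimal sketch of how the symlink approach could look from Python, assuming the naming scheme suggested above. The helper names (`container_pid_cmd`, `netns_name`, `link_netns_cmd`, `netns_exec_cmd`, `unlink_netns_cmd`) are hypothetical and not part of blockade. The functions only build argument lists, so they could be handed to `subprocess` locally or sent over SSH later.

```python
# Sketch only: all helper names here are hypothetical, not blockade's API.
NETNS_DIR = "/var/run/netns"


def container_pid_cmd(container_id):
    """Command that prints the container's init PID via docker inspect."""
    return ["docker", "inspect", "--format", "{{.State.Pid}}", container_id]


def netns_name(blockade_id, container_id):
    """Name for the netns link, following the blockade_id-container_id idea."""
    return "%s-%s" % (blockade_id, container_id)


def link_netns_cmd(pid, name):
    """Command that links the container's net namespace into NETNS_DIR."""
    return ["ln", "-sf", "/proc/%d/ns/net" % pid, NETNS_DIR + "/" + name]


def netns_exec_cmd(name, command):
    """Command that runs `command` inside the named network namespace."""
    return ["ip", "netns", "exec", name] + list(command)


def unlink_netns_cmd(name):
    """Command that removes the link again when the container goes away."""
    return ["rm", "-f", NETNS_DIR + "/" + name]
```

Running these needs root on the Docker host, for example `subprocess.check_call(link_netns_cmd(22196, "XXX"))` followed by `subprocess.check_output(netns_exec_cmd("XXX", ["tc", "qdisc", "show", "dev", "eth0"]))`.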

By the way, one goal I do have is to refactor all the commands to optionally run through SSH (via paramiko), to actually support boot2docker. So that goal should inform the approach here.

As of right now you can have a look at https://github.com/docker-in-practice/docker-blockade or dockerinpractice/blockade from Docker Hub. I'm using them in Docker in Practice - requiring the readers to use an old version of Docker tends to not impress :)

Taking a snippet from the book:

$ cat >blockade.yml <<'EOF'
containers:
  server:
    image: ubuntu:14.04.2
    command: /bin/sleep infinity
    expose: [10000]

  client1:
    image: ubuntu:14.04.2
    command: sh -c "ping $SERVER_PORT_10000_TCP_ADDR"
    links: ["server"]

  client2:
    image: ubuntu:14.04.2
    command: sh -c "ping $SERVER_PORT_10000_TCP_ADDR"
    links: ["server"]

network:
  flaky: 50%
  slow: 100ms
EOF
$ IMG=dockerinpractice/blockade
$ docker pull $IMG
latest: Pulling from dockerinpractice/blockade
[...]
Status: Downloaded newer image for dockerinpractice/blockade:latest
$ alias blockade="docker run --rm --pid=host --privileged -v \$PWD:/blockade -v /var/run/docker.sock:/var/run/docker.sock $IMG"
$ blockade up
NODE            CONTAINER ID    STATUS  IP              NETWORK    PARTITION
client1         f9b01fc7daee    UP      172.17.0.7      NORMAL
client2         9aa8a905fa78    UP      172.17.0.8      NORMAL
server          eca4cdb9c94c    UP      172.17.0.6      NORMAL

The crucial part in the repo is the .patch file, specifically https://github.com/docker-in-practice/docker-blockade/blob/master/docker-blockade.patch#L101. It runs ethtool in the context of the container network, extracts the peer_ifindex and chooses the vethxxx device based on that.
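As a rough illustration of that lookup (a sketch, not the code from the patch): assume `ethtool -S eth0` run inside the container's network namespace prints a `peer_ifindex: N` line, and that the host exposes each interface's index in `/sys/class/net/<name>/ifindex`. The function names below are hypothetical.

```python
# Sketch only: function names are hypothetical, not taken from the patch.
import os
import re


def nsenter_ethtool_cmd(pid, device="eth0"):
    """Command that runs `ethtool -S` inside the net namespace of `pid`."""
    return ["nsenter", "--target", str(pid), "--net", "ethtool", "-S", device]


def parse_peer_ifindex(ethtool_output):
    """Extract the peer_ifindex value from `ethtool -S` output."""
    match = re.search(r"^\s*peer_ifindex:\s*(\d+)\s*$", ethtool_output,
                      re.MULTILINE)
    if match is None:
        raise ValueError("no peer_ifindex in ethtool output")
    return int(match.group(1))


def find_host_interface(ifindex, sys_net="/sys/class/net"):
    """Return the host interface whose ifindex matches, or None."""
    for name in sorted(os.listdir(sys_net)):
        path = os.path.join(sys_net, name, "ifindex")
        try:
            with open(path) as handle:
                if int(handle.read().strip()) == ifindex:
                    return name
        except (OSError, ValueError):
            # Not an interface directory, or unreadable: skip it.
            continue
    return None
```

On the host, the output of `nsenter_ethtool_cmd(pid)` would be passed to `parse_peer_ifindex`, and the result to `find_host_interface` to get the vethxxx name. This still shells out to ethtool and nsenter, so both must be available wherever the commands run.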

I wasn't such a fan of the symlink approach when I was considering patches for the book because it's another thing to have to clean up.

I’m not sure if this is the right place to ask, but is there any chance blockade is going to support the native driver, since the lxc one is already deprecated and will be removed as of docker-1.10 (https://github.com/docker/docker/blob/master/docs/misc/deprecated.md#lxc-built-in-exec-driver)?

I haven't forgotten about this, but I just haven't had much time to work on blockade lately. I hope to finish this up in December, probably using @aidanhs's approach above. In the meantime I'd be quite happy to accept pull requests.

Just as a note, I dug around in the ethtool source a bit for the docker-in-practice repo and observed that getting peer_ifindex is actually not a trivial operation, otherwise I'd have created a simple, tiny static binary for it which blockade could bundle. Since getting peer_ifindex may not actually be possible in pure Python, you may need to do this anyway.

However, this doesn't help on Windows or OS X, regardless of the method! I think I concluded that if you want blockade to work cross-platform, you need to make the tiny peer_ifindex binary and then start a privileged 'helper container' which you can insert the required tools into (i.e. the created binary plus nsenter) and work from there.
No need for paramiko/ssh.

This is resolved in the forthcoming blockade 0.2.0 release, thanks to a contribution from @kongo2002. Blockade now works with the latest docker.