myoung34 / docker-github-actions-runner

This will run the new self-hosted github actions runners with docker-in-docker

Mounting docker.sock appears broken by non-root user change

imintz opened this issue

I'm using docker-in-docker by mounting the docker.sock in the action runner container. The configuration is very similar to the one in the wiki for systemd.

Despite #223 explicitly adding runner to the docker group, the runner user gets a

Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json": dial unix /var/run/docker.sock: connect: permission denied

error when attempting to do any docker operations. Using sudo does allow operations to go through, but it isn't practical because it requires changes to many of my workflows.

It looks like the issue is that the docker group gid inside the container is 999, but outside the container it can be completely different. Creating a group inside the container with the same gid as the host docker group, and then adding the runner user to that new group, allows the runner to perform docker operations without sudo.
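The workaround described above could be sketched roughly as follows. This is only an illustration, not something shipped in the image: it assumes it runs as root inside the container, that GNU stat, getent, groupadd and usermod are available, and the group name dockerhost is a hypothetical placeholder.

```shell
#!/bin/sh
# Sketch: let a user reach a bind-mounted docker.sock by joining a group
# whose gid matches the socket's owning group on the host.

# Print the numeric gid that owns the given path (GNU stat).
socket_gid() {
    stat -c '%g' "$1"
}

# Ensure a group with the socket's gid exists, then add the user to it.
# $1 = socket path, $2 = user, $3 = name for a new group (hypothetical).
grant_socket_access() {
    sock="$1"; user="$2"; group_name="$3"
    gid="$(socket_gid "$sock")" || return 1
    # Only create a group if no existing group already has this gid.
    if ! getent group "$gid" >/dev/null; then
        groupadd -g "$gid" "$group_name" || return 1
    fi
    usermod -aG "$(getent group "$gid" | cut -d: -f1)" "$user"
}

# Example (inside the container, as root):
# grant_socket_access /var/run/docker.sock runner dockerhost
```

Reading the gid from the mounted socket avoids hard-coding a value that differs per host.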

Would appreciate any advice on fixing this reliably. I have multiple machines on which I host ephemeral runner containers using systemd, each with a potentially different docker gid.

The user id inside the container is stable. You can potentially add that user id to the docker group on the host.

Something like https://stackoverflow.com/a/54504083 might also work.

I don't see an issue in testing (that SO link isn't working since they're under maintenance):

as root:

$ docker run -v /var/run/docker.sock:/var/run/docker.sock -u 0 -it --entrypoint /bin/bash myoung34/github-runner:latest
root@98fdcd829827:/actions-runner# docker ps
CONTAINER ID   IMAGE                           COMMAND                  CREATED          STATUS                          PORTS     NAMES
98fdcd829827   myoung34/github-runner:latest   "/bin/bash"              27 seconds ago   Up 26 seconds                             determined_tesla

as runner user:

$ docker run -v /var/run/docker.sock:/var/run/docker.sock -u runner -it --entrypoint /bin/bash myoung34/github-runner:latest
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

runner@4a65471bc7fb:/actions-runner$ docker ps
CONTAINER ID   IMAGE                           COMMAND                  CREATED         STATUS                         PORTS     NAMES
4a65471bc7fb   myoung34/github-runner:latest   "/bin/bash"              2 seconds ago   Up 1 second                              mystifying_stonebraker

Here’s another answer that might help: https://stackoverflow.com/a/38800291

According to microsoft/azure-pipelines-agent#2056 (comment) calling docker run with --group-add docker might also work.

See moby/moby#21184 for more issues like this.

@imintz was any of this useful? I don't see anything inherently wrong with sharing /var/run/docker.sock

Thanks for all of the suggestions! I think the cleanest solution is --group-add, as it requires no additional system configuration outside the systemd service. However, it did require passing the host docker group's gid rather than just --group-add docker.

Within my systemd service I used

/usr/bin/getent group docker | /usr/bin/cut -d: -f3

to get the host docker group gid. I then passed that via an environment variable into the docker run command.
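For readers unfamiliar with the getent output format, here is a small sketch of the lookup. The function names are hypothetical helpers; the only assumption is the standard group entry layout name:password:gid:members.

```shell
#!/bin/sh
# Sketch: extract the numeric gid from a group database entry.

# Print the third colon-separated field (the gid) of a group entry line.
gid_from_entry() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Look up a group by name on this host and print its gid.
# Returns non-zero if the group does not exist.
group_gid() {
    entry="$(getent group "$1")" || return 1
    gid_from_entry "$entry"
}

# Example (on the host):
# DOCKER_GID="$(group_gid docker)"
```

Checking the getent exit status first means a missing docker group fails loudly instead of passing an empty value to docker run.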

For reference my service ended up looking something like:

[Service]
TimeoutStartSec=0
Restart=always
ExecStartPre=-/usr/bin/docker stop github-actions-runner-%i
ExecStartPre=-/usr/bin/docker rm github-actions-runner-%i
ExecStartPre=-/usr/bin/docker pull myoung34/github-runner:latest
ExecStartPre=-rm -rf /_work/%i
ExecStartPre=-mkdir -p /_work/%i
ExecStart=/usr/bin/sh -c "export DOCKER_GID=$(/usr/bin/getent group docker | /usr/bin/cut -d: -f3) && \
                          exec /usr/bin/docker run --rm \
                              --env-file /etc/ephemeral-github-actions-runner/worker-%i.env \
                              -e RUNNER_NAME=%H-%i \
                              -e RUNNER_WORKDIR=/_work/%i \
                              -v /var/run/docker.sock:/var/run/docker.sock \
                              -v /_work/%i:/_work/%i \
                              --name github-actions-runner-%i \
                              --group-add $DOCKER_GID \
                              myoung34/github-runner:latest"

I'm sure it can be simplified.

Looks like I spoke too soon. Because the container entrypoint starts as root (doesn't use the docker USER directive), --group-add only adds the group to the root user. For some reason, it also adds the groups when docker exec'ing as any user. I thought it was working because I tested the fix by exec'ing as runner into the container.

Something like https://stackoverflow.com/a/54504083 might also work.

I opted to use setfacl as suggested in the SO link as an ExecStartPre in the systemd unit, and this seems to work in the real CI env.
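A rough sketch of what such a setfacl step might look like, for anyone trying the same approach. The exact command used in this unit was not posted, so everything here is an assumption: the uid 1001 is a placeholder for whatever uid the runner user has inside the container, and the helper names are hypothetical.

```shell
#!/bin/sh
# Sketch: grant one numeric uid read/write access to the Docker socket
# on the host via a POSIX ACL entry.

# Build the ACL entry string for a numeric uid, e.g. "u:1001:rw".
# Rejects anything that is not a plain number.
acl_entry_for_uid() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    printf 'u:%s:rw\n' "$1"
}

# Apply the ACL entry to a socket path (must run as root on the host).
# $1 = uid, $2 = socket path
allow_uid_on_socket() {
    entry="$(acl_entry_for_uid "$1")" || return 1
    setfacl -m "$entry" "$2"
}

# Example (on the host, as root; 1001 is a placeholder uid):
# allow_uid_on_socket 1001 /var/run/docker.sock
```

The ACL is tied to the socket file itself, so it likely needs to be reapplied whenever the socket is recreated, which is presumably why it fits as an ExecStartPre step.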

I'm actually undoing this change in a way.

This is a big change and will affect a lot of people if the version gets bumped. I'd rather allow this behavior via configuration and preserve previous behavior.

I've added a new environment variable RUN_AS_ROOT. Default value is true.

  • If true: preserve old behavior and run as root
  • If true and user is provided with -u (or any orchestrator equivalent): error and exit
  • If false: run container as root and assume runner user via gosu
  • If false and user is provided with -u (or any orchestrator equivalent): run entire container as runner user
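The four cases above can be summarized as a small decision function. This is only a sketch of the described behavior, not the image's actual entrypoint: the function name and the mode labels are hypothetical, and it assumes "user is provided with -u" means the container starts with a non-zero uid.

```shell
#!/bin/sh
# Sketch: map RUN_AS_ROOT and the container's starting uid to a mode.
# $1 = RUN_AS_ROOT value (defaults to "true"), $2 = uid the container started as
runner_mode() {
    run_as_root="${1:-true}"
    start_uid="$2"
    if [ "$run_as_root" = "true" ]; then
        if [ "$start_uid" -eq 0 ]; then
            echo "root"          # old behavior: everything runs as root
        else
            echo "error"         # RUN_AS_ROOT=true conflicts with -u
            return 1
        fi
    elif [ "$start_uid" -eq 0 ]; then
        echo "gosu-runner"       # start as root, drop to runner via gosu
    else
        echo "runner"            # whole container already runs as runner
    fi
}

# Example:
# runner_mode "${RUN_AS_ROOT:-true}" "$(id -u)"
```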

Wiki updating is also coming.

cc @marcus-bcl @Uzlopak