actions/runner

The Runner for GitHub Actions

Self-hosted runner with Docker step creates files that trip up the checkout step

j3parker opened this issue

Describe the bug
When using self-hosted runners, git checkouts are cached between runs (this is nice, because it greatly speeds up our builds).

However, if a docker-based step writes a file to the workspace it will (possibly) be owned by the root user. If the permissions don't give the (non-root) action runner user +w permission then a checkout step in a future workflow run will fail to remove this file. The first time, the error will look like this:

##[group]Cleaning the repository
[command]/usr/bin/git clean -ffdx
warning: could not open directory 'foo/': Permission denied
warning: failed to remove foo/: Directory not empty
##[endgroup]
##[warning]Unable to clean or reset the repository. The repository will be recreated instead.
Deleting the contents of '/home/jparker/actions-runner/_work/self-hosted-runner-permissions-issue-repro/self-hosted-runner-permissions-issue-repro'
##[error]Command failed: rm -rf "/home/jparker/actions-runner/_work/self-hosted-runner-permissions-issue-repro/self-hosted-runner-permissions-issue-repro/foo"
rm: cannot remove '/home/jparker/actions-runner/_work/self-hosted-runner-permissions-issue-repro/self-hosted-runner-permissions-issue-repro/foo': Permission denied

So git clean -ffdx tried to stat() this foo/ directory (created via a container in a previous build) but failed. It was then unable to remove the directory because it wasn't empty. It tried to fall back to rm -rf which failed for the same reasons.

In future builds it goes straight to rm -rf because the .git folder did get cleaned up. It continues to fail in the same way for all future builds. Here's a screenshot:

[screenshot of a later build failing the same way at checkout]

To Reproduce

I've created a repo that reproduces the error: https://github.com/Brightspace/self-hosted-runner-permissions-issue-repro

Here's an example of a workflow failing: https://github.com/Brightspace/self-hosted-runner-permissions-issue-repro/runs/596011452?check_suite_focus=true

Expected behavior

I guess I'd expect all the files to be owned by the runner user... in a perfect world. Maybe that could be done with user namespace maps (Docker's user namespace remapping)? Not sure what that would entail, though, or whether it makes sense for what the runner is doing.

I don't think this is an issue with the checkout action, because there isn't anything it could do about it. It would impact other actions too; checkout is just the first one I hit the issue with.

Runner Version and Platform

Ubuntu 18.04, runner version 2.168.0

These are org-level runners but I imagine it's not specific to that.

@j3parker We should document this: basically, if you are using any container feature of GitHub Actions (container steps or job containers), you should probably run the runner as root to avoid potential file permission issues caused by files the container creates.

@chrispat thoughts?

@j3parker thanks for reporting this. I am facing the same issue.
@TingluoHuang I thought about starting the runner as root, but the run.sh utility has a check that prevents starting the runner as root. https://github.com/actions/runner/blob/master/src/Misc/layoutroot/run.sh#L3

# Validate not sudo
user_id=`id -u`
if [ $user_id -eq 0 -a -z "$RUNNER_ALLOW_RUNASROOT" ]; then
    echo "Must not run interactively with sudo"
    exit 1
fi

EDIT: @j3parker I think you can achieve this simply by exporting the variable RUNNER_ALLOW_RUNASROOT=1. Check here.
Thank you @TingluoHuang

If you are going to use container based actions I would recommend you use a job container as well. Mixing container and host environments does not work very well. The other option is to add a step to change permissions of files after the container action runs.

Also, I don't think we can say the runner must run as root, since many folks run it as a systemd service and I don't think that allows running as root.

@karancode the reason that's an env var is that there are many scenarios (e.g. running as a service) where it won't work, because the service ends up running as a user (the configured user or a specified one). But there are some scenarios (like putting the runner in a container) where you want or need to explicitly run as root, and that's why the override is there. We should formalize running as root and document it better.

We need to keep this open and think a bit more on what the right solution is. If you want to run as root for now and you don't run the runner as a systemd service, then make sure that however you launch it, it calls runsvc.sh and not run.sh, so it doesn't exit on updates.

@chrispat @bryanmacfarlane Thanks.
In my case, I am running the runner inside containers, so I had no other option but to run it as root. (I tried changing file permissions after container actions, but that isn't feasible since many different actions generate many files. I also tried running it as a systemd service, but that was just uglier.)

I also faced the issue with run.sh exiting on updates (observed with a minor version update but not with a patch).
Thanks for the tip, I will check for runsvc.sh and how to use it.
PS: If there's any doc, please share; I'd love to contribute. Thanks!

If you are going to use container based actions I would recommend you use a job container as well. Mixing container and host environments does not work very well.

Cool - when you say job containers are you referring to this? I hadn't seen that before... I'll definitely try that!

We need to keep this open and think a bit more on what the right solution is.

Thanks! It looks like container: will mitigate this for us for now. If the runner itself messed around with user namespaces that might be able to solve this (they can be nested) but it might be a bunch of work...

If container: works for us we'd be interested in being able to configure our runners to only accept containerized jobs. I can open a feature request for this after testing it out.

Another option here: when running inside a container where the user is root, you can use the jobs.<jobid>.container.options directive to provide a --user uid:gid value matching the user and group that the self-hosted action runner is running as.
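
For example, a rough sketch (the image is arbitrary, and 1001:121 is a placeholder for whatever id -u / id -g report for the runner user on your host):

jobs:
  build:
    runs-on: self-hosted
    container:
      image: node:16
      # run the job container as the same uid:gid as the runner user on the host
      options: --user 1001:121
    steps:
      - uses: actions/checkout@v2
      - run: npm ci && npm test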

The downside to this is that details of the actions runner environment are starting to leak into the workflows of your projects, which is less than ideal in larger companies.

This is hitting me as well, at first it was easier to avoid, but now that we have more and more actions it is getting harder.

Is there a way to have the runner clean up the work directory after the workflow is finished? If the runner isn't running as root, it probably can't delete the directory, but it could if the cleanup ran in a container.

Maybe there is a way to turn on a cleanup job that runs after the workflow completes to delete the files. If that job runs in a container, it should have access to delete everything.

This only works if you have Docker installed, but that is fine because I don’t think the issue happens without docker.

I guess I could add this to all of my workflows, but that doesn't seem ideal; getting it added at the runner level would make things cleaner. Just an idea; not sure if it is a good one, but I figured I would add it here and see what people think.
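
For reference, a sketch of what the per-workflow version could look like today (the build job name is made up, and it assumes both jobs land on the same runner so they share a workspace; if: always() makes it run even when earlier jobs fail):

  cleanup:
    if: always()
    needs: [build]
    runs-on: self-hosted
    container:
      image: ubuntu
    steps:
      - name: Remove everything in the shared workspace (runs as root inside the container)
        run: rm -rf "$GITHUB_WORKSPACE"/..?* "$GITHUB_WORKSPACE"/.[!.]* "$GITHUB_WORKSPACE"/*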

For all my private actions, I ended up putting USER 1000:1000 in all the Docker actions. Since there is only ever root and the user I created, the only valid options are 0 and 1000. That way, any files created within the Docker action and persisted by the self-hosted runner have the correct owner.

The other solution is to just run the runner as root by setting RUNNER_ALLOW_RUNASROOT=1.

In your workflow file use this, e.g.:

  build:
    runs-on: self-hosted
    needs: [clone-repository]
    container:
      image: gradle:5.5.1-jdk11
      options: --user 1000
    ...
    ...

I suppose that will only work if you're using container vs a Docker action.

I don't believe an action has those concepts unless the action is specifically written that way, in which case every action that uses Docker would need to be updated (an extreme example). Unless there is a universal environment variable that masks files or sets file-creation permissions. (I suppose I'm thinking of something similar to UMASK here; not sure really.)

Yes, for Docker actions (private or public) the user will need to modify the corresponding Dockerfile to include USER 1000:1000, like you mentioned earlier.

The nice thing is that GitHub Actions indirectly enforces good practice, i.e. do not run a container as root.
I personally think it is the user's responsibility to set the right permissions there.

Concerning container, to avoid repeating options: --user 1000, it would be nice to be able to define it like this:

defaults:
  runs-on: self-hosted
  container-user-id: 1000

I found this temporary workaround for Docker action: Fix for self-hosted runner

P.S.: I updated the script and moved it to the repository README.md.

You might not be able to run with sudo, but you can add user to the root group. That worked for me :)

sudo usermod -a -G root <USER_NAME>

You might not be able to run with sudo, but you can add user to the root group. That worked for me :)

sudo usermod -a -G root <USER_NAME>

Strange... I did this before and it still gave me problems. I can try again in the future.

Thanks!

@jef You may try my solution. It is hacky, but it does work for me.

@jef I rewrote my script. Now it is possible to set docker's user option via env vars. You can find more details here: https://github.com/xanantis/docker-file-ownership-fix#to-fix-all-those-above-for-self-hosted-runner

Hey! I'm facing the same problem. Have you guys found an easy solution?

Hey! I'm facing the same problem. Have you guys found an easy solution?

This worked for me...

I had to remove manually the conflicted files, and then adding the user ID of the user we use for automation did the trick.

Setting the uid is not always possible; it depends on the container, how it was built, and the permissions inside it. It is a good approach most of the time, but there are edge cases, and complications can arise where separate teams manage the runners and the workflows.

I have created an action that can "reset" permissions on the workspace directories and files that would trip up a consecutive run and break at checkout; https://github.com/peter-murray/reset-workspace-ownership-action

It is still not a perfect solution, but it can be appended to the end of the workflow with minimal impact and overhead until a longer-term fix is available.

Also, I don't think we can say the runner must run as root, since many folks run it as a systemd service and I don't think that allows running as root.

Ran into this myself today due to running python in docker containers during a test step, creating root-owned __pycache__ files. These files then break the next build when the runner attempts to remove them during the checkout step, like the OP.

Are there any issues with running the service as root utilizing the [user] param for install? I'm hosting a runner on ubuntu 20.04.1 and running:

sudo ./svc.sh install root
sudo ./svc.sh start

works well for me as a workaround for now.

+1. This has also come up where users are using github/codeql-action (for Code Scanning) or other Actions that write to runner.temp. In that case it's possible for the Action to write data to the temp directory that fails to be cleaned up at the start of the next run, because the container user doesn't have permission to delete it. Documenting the right practice of getting the users to match would be a good way to help identify and prevent this.

+1, and the documentation for actions explicitly states that containers should be run as root: https://docs.github.com/en/free-pro-team@latest/actions/creating-actions/dockerfile-support-for-github-actions . Yet if you follow this, the builds will break, unless you're running the GitHub runner itself as root, I suppose.

Stating that containers should be run as root is unbelievably insecure and lazy, and violates the entire concept of process separation. If I don't carefully audit all third-party actions, I have a strong possibility of opening my network to unknown damage. That's simply unacceptable under any circumstances.

Setting the uid is not always possible; it depends on the container, how it was built, and the permissions inside it. It is a good approach most of the time, but there are edge cases, and complications can arise where separate teams manage the runners and the workflows.

I have created an action that can "reset" permissions on the workspace directories and files that would trip up a consecutive run and break at checkout; https://github.com/peter-murray/reset-workspace-ownership-action

It is still not a perfect solution, but it can be appended to the end of the workflow with minimal impact and overhead until a longer-term fix is available.

Thank you, this workaround action helped solve this problem for us.

A few weeks ago I created an example of running on Ubuntu with rootless Docker. I'm still testing the setup, but it should avoid the root problem, since the Docker user mapping is fixed.

I created a guide based on @npalm 's example on how to run the github actions runner with rootless docker: https://stackoverflow.com/questions/66137419/how-to-enable-non-docker-actions-to-access-docker-created-files-on-my-self-hoste

I actually stumbled upon another error. It mostly seems to work (hence the guide), but the new v2 docker build-push action, which uses buildx, fails with:

buildx call failed with: error: Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:459: container init caused: write sysctl key net.ipv4.ping_group_range: write /proc/sys/net/ipv4/ping_group_range: invalid argument: unknown

Edit: filed issue here docker/build-push-action#292

Managed to solve that by adding driver: docker:

      - uses: docker/setup-buildx-action@v1
        with:
          driver: docker

Everything now works just as on the ubuntu-latest runners for me! Again, thanks to @npalm.

Cleaning up after every docker step actually solves this issue.

    runs-on: self-hosted
    container:
      image: python:3.8
    steps:
    - uses: actions/checkout@v2
    - name: <DO STUFF WITH CODE>
      run: <DO STUFF WITH CODE>
    - name: if the above step failed
      if: ${{ failure() }}
      run: rm -rf ..?* .[!.]* *
    - name: clean
      run: rm -rf ..?* .[!.]* *

@jef if you have a docker step that can fail, you can use the above trick as a workaround. Otherwise, always clean up after every docker step.

Hi there,
I'm experiencing a different but related issue.
My workflow is not container based, but as soon as I have a single step based on a dockerized action, like github/super-linter, all the files are owned by root after that step.
All non-docker steps that run afterwards use the runner user, so this is definitely a bug.
For self-hosted runners the solution is RUNNER_ALLOW_RUNASROOT=1.

@rvoitenko that's not a different issue, that's literally this issue, and both my solution above with rootless docker and your solution with runasroot will work.

@Frederik-Baetens OK, but these solutions don't apply to managed runners when you don't have a container: job, only steps based on dockerized third-party actions. So something needs to be patched on GitHub-managed runners.

As far as I know, no such problems exist on the managed runners, because there the actions run as root, thereby avoiding these ownership problems.

As far as I know, no such problems exist on the managed runners, because there the actions run as root, thereby avoiding these ownership problems.

They don't run as root; they run as a user named "runner".

Although this particularly impacts self-hosted runners that re-use workspaces (the only option until #510 is solved), it isn't really specific to self-hosted runners. Here's an example (not terribly realistic) workflow file that uses the GitHub-hosted runners:

on: push
jobs:
  repro:
    runs-on: ubuntu-latest
    steps:
      - run: whoami

      - uses: actions/checkout@v2

      # It should work a second time
      - uses: actions/checkout@v2

      - name: Run a container that outputs to foo/output-file and puts nasty permissions on foo
        uses: ./

      - name: Print permissions for all files
        run: ls -alFR || true

      # Fails to git clean the foo folder, tries to rm -rf the checkout and then fails
      - uses: actions/checkout@v2

Here's what happens if you run it:

[screenshot of the workflow run failing at the final checkout step]

In the "print permissions" step it prints out stuff like:

Run ls -alFR || true
.:
total 28
drwxr-xr-x 5 runner docker 4096 May  3 12:21 ./
drwxr-xr-x 3 runner docker 4096 May  3 12:21 ../
drwxr-xr-x 8 runner docker 4096 May  3 12:21 .git/
ls: cannot open directory './foo': Permission denied
drwxr-xr-x 3 runner docker 4096 May  3 12:21 .github/
-rw-r--r-- 1 runner docker   82 May  3 12:21 Dockerfile
-rw-r--r-- 1 runner docker  374 May  3 12:21 README.md
drwx------ 2 root   root   4096 May  3 12:21 foo/
...

(note the error descending into ./foo)

@j3parker it looks like the foo folder was created inside the docker image as the root user.

Yes, exactly. This is what will happen by default (either with hosted runners or with self-hosted runners using the documented setup steps).
Here is the Dockerfile for the repro.

I ran into this issue, but it could easily have been avoided if the docs were better. I followed the instructions from https://docs.github.com/en/actions/hosting-your-own-runners/adding-self-hosted-runners, which ultimately brought me to a URL like https://github.com/**my-org**/**my-repo**/settings/actions/runners/new. On that screen the docs say:

# Create the runner and start the configuration experience
$ ./config.sh --url https://github.com/farmerstoyou/rails_app --token ABCD
# Last step, run it!
$ ./run.sh

The ./run.sh cannot be run as root, as many have pointed out. But there is another script in the same dir that WILL start the runner as a systemd service running under root. The above docs should add this:

# Run it as a systemd service:
$ ./svc.sh install
$ ./svc.sh start

@dhughesbc 🙏 Cool, working fine for me like magic, this should become "best practice".

./svc.sh start must be run as root, but that doesn't run the runner process as root; it still runs as the current user. The issue persists.

@klausbadelt just use rootless docker and you won't have any more problems, without having to run anything as root.

Guide here: https://stackoverflow.com/questions/66137419/how-to-enable-non-docker-actions-to-access-docker-created-files-on-my-self-hoste

I had the same problem in a workflow that builds and pushes a docker image. A trick I use to circumvent this problem is a docker-in-docker approach. This may only work on self-hosted runners (where I have used and tested the workflow so far). It mounts the runner's docker.sock and runs the docker commands in its own docker:stable container.
Example workflow:

jobs:
  build-push-base-image:
    name: Build and push the image
    runs-on: [ self-hosted, ubuntu ]
    container:
      image: docker:stable
      volumes:
        # Mount the docker sock for host's docker engine to be usable inside container
        - /var/run/docker.sock:/var/run/docker.sock
    defaults:
      run:
        # Force sh since bash is not supported in docker:stable
        shell: sh
    steps:
      - uses: actions/checkout@v2

      - name: Login to Docker Repo
        uses: docker/login-action@v1
        with:
          registry: ...
          username: ...
          password: ...

      - name: Build, tag and push the image
        run: |
          docker build -t ${{ env.DOCKER_IMAGE_NAME }}:${{ env.DOCKER_IMAGE_TAG }} .
          docker tag ${{ env.DOCKER_IMAGE_NAME }}:${{ env.DOCKER_IMAGE_TAG }} ${{ env.DOCKER_REPO }}/${{ env.DOCKER_ORG }}/${{ env.DOCKER_IMAGE_NAME }}:${{ env.DOCKER_IMAGE_TAG }}
          docker push ${{ env.DOCKER_REPO }}/${{ env.DOCKER_ORG }}/${{ env.DOCKER_IMAGE_NAME }}:${{ env.DOCKER_IMAGE_TAG }}

Similar problem here: actions with global tooling, like actions/setup-go@v2, can run without a container (as the runner user) in one job and inside a container in another job.
That can break file permissions, and there's no easy way to fix the permissions automatically after the job finishes, except doing it by hand.
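
A rough stopgap for that case, as a sketch (uid/gid 1000:1000 is an assumption about the runner user, and RUNNER_TOOL_CACHE is assumed to point at the mounted tool directory that ends up root-owned):

  test-in-container:
    runs-on: self-hosted
    container:
      image: ubuntu
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-go@v2
        with:
          go-version: '1.17'
      - run: go test ./...
      - name: Hand the tool cache and workspace back to the runner user
        if: always()
        run: chown -R 1000:1000 "$RUNNER_TOOL_CACHE" "$GITHUB_WORKSPACE"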


Having the same issue when devs are building Docker images.

If Dockerfiles don't explicitly drop privileges and use volumes mounted from the host, files can be created with UID 0.

This causes GitHub's cleanup to fail, as it doesn't have permission to delete root-owned files.

For example note the .mypy_cache directory below:

$ pwd
/home/ec2-user/actions-runner/_work/my_cool_app/src/glue/app

$ ls -la
total 44
drwxr-xr-x 6 ec2-user ec2-user   200 Sep 27 23:46 .
drwxr-xr-x 3 ec2-user ec2-user    17 Sep 27 23:44 ..
-rw-r--r-- 1 ec2-user ec2-user   308 Sep 27 23:44 Dockerfile
drwxr-xr-x 4 ec2-user ec2-user    98 Sep 27 23:44 app
-rw-r--r-- 1 ec2-user ec2-user  2546 Sep 27 23:44 app_run.py
drwxr-xr-x 3 root     root        76 Sep 27 23:46 .mypy_cache

It would be great if GitHub had some global (organisation-wide) defaults that get applied to all repos.

Or a post step for the checkout action that always cleans up at the end of the job that started it.


@rajbos but if the root-owned files were created by a docker container running with a volume (bind mount) without correctly mapping the user to the host, then the cleanup step may not have permission to delete them.

I suppose you could add a sudoers rule to allow rm -rf on any files under the worker path, but it would be nice if GitHub could make it easier to globally (org-wide) configure defaults for all workflows.

@sammcj you can use the same "feature" that created the files to remove or chown them, i.e. a docker-based action step, which is what this action does: https://github.com/marketplace/actions/reset-workspace-ownership-action. It runs as root inside the container and resets the owner to the specified uid, which would be that of the actions runner user, for instance.

You can add that as a post step that always runs as the last step in your job. Yes, it is far from a perfect solution, but it gives you control without having to over-privilege the runner user account by default, since you cannot easily influence or control these docker-based action containers.
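
The same pattern can also be open-coded without the marketplace action; a minimal sketch (uid/gid 1000:1000 is assumed to be the runner user):

      - name: Reset workspace ownership to the runner user
        if: always()
        uses: docker://alpine
        with:
          # runs as root inside the container, so it can chown files created by other containers
          args: /bin/sh -c "chown -R 1000:1000 /github/workspace"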

For those of you who can use rootless Docker, ScribeMD offers the rootless-docker GitHub action. It has only been tested on ubuntu-20.04 so far in the interest of incremental progress, but seeing as it is very simple, I am optimistic that it will work on ubuntu-18.04. It technically has a race condition since it doesn't wait for the Docker daemon to be ready, but regardless that generally happens much faster than launching a new shell for the next step. If anyone has thoughts on how best to eliminate the race condition, I am all ears.

Everyone's comments above are appreciated greatly. I prefer to run everything in a container to keep the build server's environment as clean as possible, and I ended up with the following at the start of my jobs section:

jobs:
  build_and_test:
    runs-on: [ self-hosted ]
    container:
      image: ubuntu

    steps:
      - name: Clean the workspace
        run: rm -rf $GITHUB_WORKSPACE/*

      - uses: actions/checkout@v2

      ...

If a job container is not specified, you can use this as a cleanup step:

- name: Clean the workspace
  uses: docker://alpine
  with:
    args: /bin/sh -c "rm -rf /github/workspace/.* || rm -rf /github/workspace/*"

A few additional thoughts:

  1. Coming from drone.io, it's a bit disappointing that environment pollution is even something that needs to be solved manually with GH actions (with containers), but I don't think that starting the runner as root is the right solution; this likely isn't even an option in many enterprise environments.
  2. The checkout action really ought to handle the cleanup. If it's running in a container because of a job container specification, it should have the same permissions to modify/delete the files as the commands that created them.
  3. None of this would be an issue if a temporary docker volume was used instead of volume-mounting the workspace dir on the runner host. Any additional mounted volumes can be specified manually, but if you're running everything in a container, I'm not sure why you'd want them. The whole reason I want to run everything in a container is to start with a clean slate, not a workspace contaminated by other builds.
    • The checkout action could even run in a container itself, preventing the need to have git 2.18+ installed on the runner. (The ability to specify a cert bundle would be critical for GHES customers though.)

The aforementioned race condition has been fixed in rootless-docker@0.1.1. I share @jsmartt's perspective that running the runner as root is less desirable. I would expect using rootless Docker to work on most self-hosted Linux images, and I believe the outcome is essentially the same as cleaning the workspace since the runner cleans up after itself when it has permission to do so.
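
A minimal usage sketch (no inputs; the runs-on label follows the "tested on ubuntu-20.04" note above, and whether a newer tag or extra options are needed should be checked against the action's README):

jobs:
  build:
    runs-on: ubuntu-20.04
    steps:
      - uses: ScribeMD/rootless-docker@0.1.1
      - uses: actions/checkout@v2
      # subsequent docker commands run against the rootless daemon,
      # so files they create are not owned by root
      - run: docker build -t example/app .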

You might not be able to run with sudo, but you can add user to the root group. That worked for me :)

sudo usermod -a -G root <USER_NAME>

This works for me as well

sudo usermod -a -G root <USER_NAME>

Thank you :-) This worked for me.

Are people still having this issue? I've been able to fix my issue by doing what was suggested in #434 (comment) but it feels a bit gross to run the service as root.

My issue is that after running a black formatting action (https://github.com/psf/black) there are files left over in the _actions dir that aren't owned by the runner's user but rather by root:

Access to the path '/home/ubuntu/actions-runner/_work/_actions/psf/black/stable/.black-env/lib/python3.10/site-packages/black-23.3.0.dist-info/INSTALLER' is denied.) (Access to the path '/home/ubuntu/actions-runner/_work/_actions/psf/black/stable/.black-env/pyvenv.cfg' is denied.) 

(Access to the path '/home/ubuntu/actions-runner/_work/_actions/psf/black/stable/.black-env/lib64' is denied.) (Access to the path '/home/ubuntu/actions-runner/_work/_actions/psf/black/stable/.black-env/lib/python3.10/site-packages/mypy_extensions.py' is denied.)

This causes subsequent actions to fail with the above message.


Yep, still happens with a few different clients of mine.

Instead of running the service as root, you can use ScribeMD/rootless-docker to run Docker in rootless mode.

Not 100% sure this is relevant, sorry, but to work around what I think is a similar problem, I wrote a small script that dynamically creates a user inside the container with an ID that matches the user ID outside: https://stackoverflow.com/a/74330112

In that case the problem being solved is sharing files created inside the container with the outside.

It's ugly but surprisingly effective at solving what seems to be a docker design issue.

If anyone is having the problem I mentioned above with black, I've actually just had a PR merged that removes all files created by black during the action 👍

[screenshot of the error]
Facing this issue with a deployment workflow.

Facing this issue with a deployment workflow.

I'm on CentOS 7 as the root user.