actions / runner-images

GitHub Actions runner images

System.IO.IOException: No space left on device

clementrx opened this issue

Description

Hi, for the past two days I have been encountering the following problem:

```
System.IO.IOException: No space left on device : '/home/runner/runners/2.314.1/_diag/Worker_20240312-083344-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id)
   at GitHub.Runner.Common.HostTraceListener.TraceEvent(TraceEventCache eventCache, String source, TraceEventType eventType, Int32 id, String message)
   at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message)
   at GitHub.Runner.Worker.Worker.RunAsync(String pipeIn, String pipeOut)
   at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)
System.IO.IOException: No space left on device : '/home/runner/runners/2.314.1/_diag/Worker_20240312-083344-utc.log'
   at System.IO.RandomAccess.WriteAtOffset(SafeFileHandle handle, ReadOnlySpan`1 buffer, Int64 fileOffset)
   at System.IO.Strategies.BufferedFileStreamStrategy.FlushWrite()
   at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
   at System.Diagnostics.TextWriterTraceListener.Flush()
   at GitHub.Runner.Common.HostTraceListener.WriteHeader(String source, TraceEventType eventType, Int32 id)
   at GitHub.Runner.Common.HostTraceListener.TraceEvent(TraceEventCache eventCache, String source, TraceEventType eventType, Int32 id, String message)
   at System.Diagnostics.TraceSource.TraceEvent(TraceEventType eventType, Int32 id, String message)
   at GitHub.Runner.Common.Tracing.Error(Exception exception)
   at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)
```



### Platforms affected

- [ ] Azure DevOps
- [X] GitHub Actions - Standard Runners
- [ ] GitHub Actions - Larger Runners

### Runner images affected

- [ ] Ubuntu 20.04
- [X] Ubuntu 22.04
- [ ] macOS 11
- [ ] macOS 12
- [ ] macOS 13
- [ ] macOS 13 Arm64
- [ ] macOS 14
- [ ] macOS 14 Arm64
- [ ] Windows Server 2019
- [ ] Windows Server 2022

### Image version and build link

 Image: ubuntu-22.04
 Version: 20240304.1.0

### Is it regression?

Yes. Last working image: ubuntu-22.04, version 20240218.1.0

### Expected behavior

The workflow runs a script that sends an email automatically every day.

### Actual behavior

The runner stops before the script can run.

### Repro steps

```yaml
name: Preds today

on:
  schedule:
    - cron: '00 08 * * *'
  # push:
  #   branches: main

env:
  RENV_PATHS_ROOT: ~/.local/share/renv

jobs:
  predictions:
    runs-on: ubuntu-latest
    timeout-minutes: 120
    steps:

      - name: Set Swap Space
        uses: pierotofy/set-swap-space@master
        with:
          swap-size-gb: 10

      - name: Checkout repos
        uses: actions/checkout@v3

      - name: Set up R
        uses: r-lib/actions/setup-r@v2

      - name: Install Miniconda and dependencies
        run: |
          sudo wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh && \
          sudo /bin/bash ~/miniconda.sh -b -p /opt/conda

          # Put conda in path so we can use conda activate
          export PATH=/opt/conda/bin:$PATH

          sudo apt-get update && \
          sudo apt-get install -y build-essential libssl-dev zlib1g-dev libbz2-dev \
          libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev \
          libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl git

      - name: Install libcurl
        run: sudo apt-get install -y libcurl4-openssl-dev

      - name: Set up pandoc
        uses: r-lib/actions/setup-pandoc@v2

      - name: Set up quarto
        uses: quarto-dev/quarto-actions/setup@v2

      - name: Cache packages
        uses: actions/cache@v3
        with:
          path: ${{ env.RENV_PATHS_ROOT }}
          key: ${{ runner.os }}-renv-${{ hashFiles('**/renv.lock') }}
          restore-keys: |
            ${{ runner.os }}-renv-

      - name: Set up Renv
        uses: r-lib/actions/setup-renv@v2


      # - name: Restore packages
      #   shell: Rscript {0}
      #   run: |
      #     if (!requireNamespace("renv", quietly = TRUE)) install.packages("renv")
      #     renv::restore()

      - name: Install TensorFlow and Keras
        env:
          HUB_PROJECT_UID: ${{ secrets.HUB_PROJECT_UID }}

        run: |
          Rscript -e 'remotes::install_github("rstudio/tensorflow", auth_token = Sys.getenv("HUB_PROJECT_UID"))'
          Rscript -e 'reticulate::install_python(); library(tensorflow); install_tensorflow(envname = "r-tensorflow")'
          Rscript -e 'install.packages("keras"); library(keras); install_keras()'

      - name: Run main
        run: Rscript -e 'source("main.R")'
```
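Given how large the R/TensorFlow/Keras toolchain is, a commonly cited workaround for hosted runners is to delete preinstalled toolsets the job does not need before the heavy install steps. A sketch of such a step (the paths are assumptions about the hosted Ubuntu image layout and may change between image versions):

```yaml
      - name: Free disk space (illustrative)
        run: |
          # Remove large preinstalled toolsets this job does not use
          sudo rm -rf /usr/share/dotnet /usr/local/lib/android /opt/ghc
          sudo docker image prune --all --force
          df -h /
```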

Hello @clementrx!
Based on the workflow file provided, it looks like you are checking out some repository. Could you please provide us with a link to the public repository where the issue occurred for further investigation?

Unfortunately it's a private repository.

@clementrx am I right to suspect that the size of your project (be it just the checked-out files, or something you build on the runner plus the checked-out files) exceeds 14 GB?

No, my entire project is at most 10 GB.

We would need your help with minimal repro steps so that we can reproduce the problem and solve it.

Having a similar issue today in one of our projects. Nothing changed in the project itself (no code change, no CI workflow change). It used to pass; it now fails with [Errno 28] No space left on device.
I cannot share more (private repo), but I feel it could be related to the issue described above.

My project predicts horse races: it downloads from Google Drive a .zip containing a .sqlite database (around 2 GB), updates the file, then makes predictions and emails them to me. It had been working very well for 3-4 months, until this week.
I will try other tests in the next few days.

I'm also running into this issue (in a private repo) and for me the issue started on Thursday, March 7 2024.
In an ssh session I was able to confirm low available disk space:

```
root@2e7b42af9e08:/# df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay          73G   73G  170M 100% /
tmpfs            64M     0   64M   0% /dev
shm              64M     0   64M   0% /dev/shm
/dev/root        73G   73G  170M 100% /__t
tmpfs           1.6G  1.3M  1.6G   1% /run/docker.sock
tmpfs           3.9G     0  3.9G   0% /proc/acpi
tmpfs           3.9G     0  3.9G   0% /proc/scsi
tmpfs           3.9G     0  3.9G   0% /sys/firmware
```

I've also noticed that sudo is not installed, though I would expect it to be as part of the ubuntu latest image.

All spec runs are failing with no code changes.
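When the disk is at 100% like this, it helps to see which directories are actually consuming the space. A generic sketch, runnable inside the job or an SSH session (the path `.` and the depth are illustrative):

```shell
# Show free space on the workspace filesystem, then list the largest
# entries one level below the current directory, biggest first
df -h .
du -xk -d 1 . 2>/dev/null | sort -rn | head -n 15
```

`du -x` stays on one filesystem, so tmpfs mounts like /dev/shm do not distort the totals.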

Hello @clementrx!
Could you please add the df -h command to your workflow to check available disk space and share the results with us?
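For reference, such a check is a single workflow step placed before the failing step (the step name is illustrative):

```yaml
      - name: Check disk space
        run: df -h
```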

@clementrx could you please share with us links to the failed build and to the build in which you executed the df -h command, even if they lead to a private repository?

Unfortunately, I cannot share my private repos...
I will try to reproduce this in a public repo this weekend.

Today the script stopped with this message:

The hosted runner: GitHub Actions 14 lost communication with the server. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.

I "solved" the problem by building my own image with a Dockerfile (which is faster).

I believe this issue is related to actions/runner#3184

Closing as non-actionable for us at the moment. As per the docs, we only guarantee 14 GB of free disk space, even though the actual amount of free space may vary from time to time, so please make sure your checked-out projects and resulting files fit within that size. There is a workaround using the additional /mnt partition at the moment; the /dev/root partition remains "at least" (more is possible, but not less) 14 GB free.
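The /mnt workaround mentioned above is typically wired up as an early step that redirects scratch files to the larger partition (step and directory names are illustrative; the size of /mnt varies by runner):

```yaml
      - name: Use /mnt as scratch space (illustrative)
        run: |
          sudo mkdir -p /mnt/scratch
          sudo chown "$USER" /mnt/scratch
          # Point temporary files at the larger /mnt partition for later steps
          echo "TMPDIR=/mnt/scratch" >> "$GITHUB_ENV"
```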