ramsey / composer-install

:gift: A GitHub Action to streamline installation of PHP dependencies with Composer.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Cache restore race condition?

mpdude opened this issue Β· comments

Hey there πŸ‘‹πŸ» , thank you for this action! It makes my workflows (and thus) life much easier than having to deal with Composer and actions/cache myself.

However, I did a few test runs and noticed that I get failures every now and then, let's say in 1 out of 5 workflow runs. Those failures are transient and most of the time go away if I run the workflow again.

The bottom line error message is

2021-06-09T05:51:45.7317484Z ##[error]Failed to execute git clone --mirror -- 'ssh://git@github.com/MYORG/MYORG-SOME-PRIVATE-REPO' '/home/runner/.cache/composer/vcs/ssh---git-github.com-MYORG-MYORG-SOME-PRIVATE-REPO/'

Cloning into bare repository '/home/runner/.cache/composer/vcs/ssh---git-github.com-MYORG-MYORG-SOME-PRIVATE-REPO'...
BUG: refs/files-backend.c:2953: initial ref transaction called with existing refs
Aborted (core dumped)

Unfortunately, the Composer dependency in question is for a private repo that will be git-cloned as source, and I can't share all the details. But here is an excerpt from the raw logs with what I think is relevant.

2021-06-09T05:51:23.2825442Z ##[group]Operating System
2021-06-09T05:51:23.2827149Z Ubuntu
2021-06-09T05:51:23.2828278Z 18.04.5
2021-06-09T05:51:23.2829465Z LTS
2021-06-09T05:51:23.2830596Z ##[endgroup]
2021-06-09T05:51:23.2831828Z ##[group]Virtual Environment
2021-06-09T05:51:23.2833168Z Environment: ubuntu-18.04
2021-06-09T05:51:23.2834390Z Version: 20210531.0
2021-06-09T05:51:23.2837061Z Included Software: https://github.com/actions/virtual-environments/blob/ubuntu18/20210531.0/images/linux/Ubuntu1804-README.md
2021-06-09T05:51:23.2839459Z Image Release: https://github.com/actions/virtual-environments/releases/tag/ubuntu18%2F20210531.0
2021-06-09T05:51:23.2841153Z ##[endgroup]
...
2021-06-09T05:51:40.1782268Z ##[group]Run ramsey/composer-install@v1
2021-06-09T05:51:40.1783292Z with:
2021-06-09T05:51:40.1784197Z   dependency-versions: locked
2021-06-09T05:51:40.1785094Z env:
2021-06-09T05:51:40.1785850Z   APP_ENV: test
2021-06-09T05:51:40.1786845Z   PHP_VERSION: 7.2
2021-06-09T05:51:40.1787798Z   SSH_AUTH_SOCK: /tmp/ssh-MZQQ7o4xWr9F/agent.2587
2021-06-09T05:51:40.1789259Z   SSH_AGENT_PID: 2588
2021-06-09T05:51:40.1790655Z   TEMP: /home/runner/work/MYREPO/MYREPO/tmp
2021-06-09T05:51:40.1793996Z   COMPOSER_AUTH: ***
2021-06-09T05:51:40.1794829Z ##[endgroup]
2021-06-09T05:51:40.5953452Z Composer cache directory found at /home/runner/.cache/composer
2021-06-09T05:51:40.5977780Z Composer working directory set to project root.
2021-06-09T05:51:40.5981742Z Using locked versions of dependencies
2021-06-09T05:51:40.6680768Z PHP version is 7.2.34-21+ubuntu18.04.1+deb.sury.org+1
2021-06-09T05:51:40.6682462Z Operating system is linux
2021-06-09T05:51:40.6686362Z Cache primary key is linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-82ffafb70b7e4a023b19b2e8ef1eef08eb2e61b31dd27e40b751819a7f8331fe-
2021-06-09T05:51:40.6693769Z Cache restore keys are: linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-82ffafb70b7e4a023b19b2e8ef1eef08eb2e61b31dd27e40b751819a7f8331fe-, linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-
2021-06-09T05:51:40.7612805Z [command]/usr/local/bin/composer install --no-interaction --no-progress --ansi
2021-06-09T05:51:40.9632982Z  Installing dependencies from lock file (including require-dev) 
2021-06-09T05:51:40.9751962Z  Verifying lock file contents can be installed on current platform. 
2021-06-09T05:51:42.4208308Z Received 0 of 49209747 (0.0%), 0.0 MBs/sec
2021-06-09T05:51:42.6212086Z  Package operations: 186 installs, 0 updates, 0 removals 
2021-06-09T05:51:42.6217458Z   - Downloading  composer/package-versions-deprecated  ( 1.11.99.2 )
2021-06-09T05:51:42.6228500Z   - Downloading  symfony/polyfill-mbstring  ( v1.23.0 )
... a lot of "Downloading ..." dist files ...
2021-06-09T05:51:42.6460638Z   - Syncing  MYORG/SOME-PRIVATE-REPO  ( 3.1.0 ) into cache
2021-06-09T05:51:43.0227542Z Received 49209747 of 49209747 (100.0%), 29.3 MBs/sec
2021-06-09T05:51:43.0228762Z Cache Size: ~47 MB (49209747 B)
2021-06-09T05:51:43.0267378Z [command]/bin/tar --use-compress-program zstd -d -xf /home/runner/work/_temp/1c4b8074-9a05-4165-8e57-9b97c3c0e543/cache.tzst -P -C /home/runner/work/MYREPO/MYREPO
2021-06-09T05:51:43.1632026Z Cache restored successfully
2021-06-09T05:51:43.1746121Z Cache restored from key: linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-bac6c14ffff580a04a4fea1394e68aa862bb682755fbc9d2277682591500a408-
2021-06-09T05:51:45.7317484Z ##[error]Failed to execute git clone --mirror -- 'ssh://git@github.com/MYORG/MYORG-SOME-PRIVATE-REPO' '/home/runner/.cache/composer/vcs/ssh---git-github.com-MYORG-MYORG-SOME-PRIVATE-REPO/'

Cloning into bare repository '/home/runner/.cache/composer/vcs/ssh---git-github.com-MYORG-MYORG-SOME-PRIVATE-REPO'...
BUG: refs/files-backend.c:2953: initial ref transaction called with existing refs
Aborted (core dumped)

2021-06-09T05:51:45.7333789Z 
2021-06-09T05:51:45.7538921Z  [37;41m                                                                                                                                                                                            [39;49m
2021-06-09T05:51:45.7541852Z  [37;41m  [RuntimeException]                                                                                                                                                                        [39;49m
2021-06-09T05:51:45.7545887Z  [37;41m  Failed to execute git clone --mirror -- 'ssh://git@github.com/MYORG/MYORG-SOME-PRIVATE-REPO' '/home/runner/.cache/composer/vcs/ssh---git-github.com-MYORG-MYORG-SOME-PRIVATE-REPO/'   [39;49m
2021-06-09T05:51:45.7549415Z  [37;41m                                                                                                                                                                                            [39;49m
2021-06-09T05:51:45.7552153Z  [37;41m  Cloning into bare repository '/home/runner/.cache/composer/vcs/ssh---git-github.com-MYORG-MYORG-SOME-PRIVATE-REPO'...                                                                   [39;49m
2021-06-09T05:51:45.7556156Z  [37;41m  BUG: refs/files-backend.c:2953: initial ref transaction called with existing refs                                                                                                         [39;49m
2021-06-09T05:51:45.7558622Z  [37;41m  Aborted (core dumped)                                                                                                                                                                     [39;49m
2021-06-09T05:51:45.7561683Z  [37;41m                                                                                                                                                                                            [39;49m

What strikes me is the way how

2021-06-09T05:51:43.1632026Z Cache restored successfully
2021-06-09T05:51:43.1746121Z Cache restored from key: linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-bac6c14ffff580a04a4fea1394e68aa862bb682755fbc9d2277682591500a408-

happens just before the error, and how that is interleaved with the actual Composer run.

So, my suspicion is that some async operations go wrong here, so that the cache is restored while Composer is already running. Maybe that does not make much of a difference for dist files, but might be enough to make git stumble when it runs on a partly-restored repo?

I should also add that I have some concurrent workflows that run this action here at about the same time. But to my knowledge, different workflows run in isolated environments. So I would rule out any kind of cross-workflow race conditions.

I am sure this is the race condition as I initially described. Here is the log output of another run that worked:

PHP version is 7.2.34-21+ubuntu18.04.1+deb.sury.org+1
Operating system is linux
Cache primary key is linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-bac6c14ffff580a04a4fea1394e68aa862bb682755fbc9d2277682591500a408-
Cache restore keys are: linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-bac6c14ffff580a04a4fea1394e68aa862bb682755fbc9d2277682591500a408-, linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-
/usr/local/bin/composer install --no-interaction --no-progress --ansi
Installing dependencies from lock file (including require-dev)
Verifying lock file contents can be installed on current platform.
Received 49209747 of 49209747 (100.0%), 123.8 MBs/sec
Cache Size: ~47 MB (49209747 B)
/bin/tar --use-compress-program zstd -d -xf /home/runner/work/_temp/bba28022-b2f5-403f-9ceb-305964208faf/cache.tzst -P -C /home/runner/work/REPO/REPO
Cache restored successfully
Cache restored from key: linux-php-7.2.34-21+ubuntu18.04.1+deb.sury.org+1-locked-bac6c14ffff580a04a4fea1394e68aa862bb682755fbc9d2277682591500a408-
Package operations: 186 installs, 0 updates, 0 removals
  - Syncing webfactory/wfdynamic (3.1.0) into cache

You can see that /usr/local/bin/composer install --no-interaction --no-progress --ansi comes before Cache restored successfully, but this time, the cache was quick enough to be restored before Composer did any git clone operations.

I have no experience with TypeScript or asynchronous programming in JavaScript.

That being said, here is the code from https://github.com/actions/cache that you require() in src/cache/factory.ts:

https://github.com/actions/cache/blob/0638051e9af2c23d10bb70fa9beffcad6cff9ce3/src/restore.ts#L69-L71

The run() call seems to trigger the actual cache restore code; but since that is async, it might take a while to complete. Does run() return a promise in this case, but that promise is never exported from the module?

If that is the case, what does this code here

return async () => await require('cache/dist/restore.js')
wait for?

I have no experience with TypeScript or asynchronous programming in JavaScript.

My experience is also limited, but I think calling attention to the fact that there's absolutely asynchronous operations going on here points to a likely smoking gun.

In the code you pointed out, I thought the await told it to wait, but that does seem in conflict with the async keyword. I'll check with some folks who know more about how JS works so I can get this to be more synchronous.

Thanks!

Awesome. IMHO the run() in the restore.js file from actions/Core looks suspicious, since it seems to return a promise that we never get hold of.

That might be very difficult to fix, since it would require tweaking upstream code.

Looking forward to hearing from you πŸ‘¨πŸ»β€πŸ’»

we are also running in spurious errors which look like race-conditions to me. any progress on this front?

This is fixed in v2, please upgrade using ramsey/composer-install@v2 or ramsey/composer-install@2.0.0. Details are available on the release announcement and the README.

Thanks!