ScribeMD / docker-cache

Cache Docker Images Whether Built or Pulled

Avoid post cache phase when key is already in cache

pjonsson opened this issue · comments

It would be nice if the post cache phase checked that the cache key is still missing before it starts the slow operations. This would speed up workflows whose jobs have different run times (e.g., 1 min vs. 10 min) as well as workflows where some runners start late (I'm not sure why runner starts are sometimes staggered; it might be related to load on GitHub's CI infrastructure).

Folks can pass read-only: true to all but one job in such a scenario, but that is more manual than automatic, so I can see the value of the proposed change. I still think the better solution in such a situation would be to avoid performing expensive operations concurrently and to perform them only once. My understanding is that runners are allocated as they become available, which introduces some randomness to their start times.

Unfortunately, the official GitHub cache action currently makes this feature request infeasible to implement. There is presently no exposed method for efficiently querying the cache. I upvoted actions/cache#321, a feature request to expose a method to test whether a key is cached without also restoring the cache, and recommend you do the same. The feature is currently on their roadmap, but there is no public timeline. Please feel free to post back here once the upstream cache action makes this request feasible.

The issue you linked was closed as completed despite the feature only being on the roadmap. I'm surprised by how they manage their backlog; it must be hard to keep track of closed tickets that still require resources. Here's an open issue that I upvoted instead:
actions/cache#420

I guess your trick from #44 might be usable in this scenario as well: the post step starts by caching a 0-byte file under a related key name, and the runner that succeeds in saving it is the one that also caches the Docker images.
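
A minimal sketch of that idea, assuming a post step written in TypeScript against @actions/cache; the key names, the claim path, and the saveDockerImages callback are illustrative rather than anything this action actually defines. Because a given cache key can only be saved once, the tiny claim save acts as a best-effort lock:

```typescript
import { promises as fs } from "node:fs";

import { saveCache } from "@actions/cache";

// Illustrative names; the real action derives its keys from user input.
const DOCKER_KEY = "docker-images-deadbeef";
const CLAIM_KEY = `${DOCKER_KEY}-claim`;
const CLAIM_PATH = "cache-claim";

// Try to win the right to perform the expensive save by caching a 0-byte file
// under a related key. Only one runner can save a given key, so the others
// back off without ever running docker save.
async function trySaveDockerImages(
  saveDockerImages: () => Promise<void>,
): Promise<void> {
  await fs.writeFile(CLAIM_PATH, "");
  const cacheId = await saveCache([CLAIM_PATH], CLAIM_KEY);
  if (cacheId === -1) {
    // Recent @actions/cache versions return -1 when the key has already been
    // reserved or saved by another job (older versions throw instead).
    console.log(`${CLAIM_KEY} already claimed; skipping the expensive save.`);
    return;
  }
  await saveDockerImages();
}
```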

That is a pull request, but I upvoted it as well. For context, the other pertinent issue I found was closed as a duplicate of the one I linked.

Good point that correlated caching can help here.

GitHub has since released a REST API for caches that makes this feature feasible to implement. I am not sure how valuable it would be, though, since I expect that in the common case no job will have finished saving the cache before the other jobs have all raced past the check for whether the cache already exists.
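
For reference, a rough sketch of what that check could look like via the caches REST API using @actions/github; the token plumbing and the key are assumptions, not part of this action:

```typescript
import { context, getOctokit } from "@actions/github";

// Illustrative check: does any cache entry already exist for `key`?
// Assumes the step is given a token with actions: read permission.
async function cacheKeyExists(token: string, key: string): Promise<boolean> {
  const octokit = getOctokit(token);
  const response = await octokit.request(
    "GET /repos/{owner}/{repo}/actions/caches",
    {
      owner: context.repo.owner,
      repo: context.repo.repo,
      key, // the API filters results to caches matching this key
    },
  );
  return response.data.total_count > 0;
}
```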

I'm not trying to bump the priority of this, but I can provide some context: with 17 jobs, it's not hard to get into a situation where some jobs have to wait for a runner. If the job runtimes are somewhat balanced, it's usually the late-starting jobs that block the merge.

I don't have any experience with 5-10 jobs, so I can't comment on how common the problem is for that scenario.

I see your point. I don't know how this varies by plan, but the free plan allows 20 concurrent GitHub-hosted runners, so once that limit is reached, this optimization becomes more valuable.

@actions/cache added a lookupOnly option to restoreCache in v3.2.0, making this feature far easier to implement in JavaScript than it previously was through the REST API.
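
For illustration, a minimal sketch of the check that option enables, with a placeholder key and paths rather than the action's real values:

```typescript
import { restoreCache, saveCache } from "@actions/cache";

// Placeholder values; the real action computes these from its inputs.
const KEY = "docker-images-deadbeef";
const PATHS = ["docker-images.tar"];

async function saveDockerImagesUnlessCached(
  saveDockerImages: () => Promise<void>,
): Promise<void> {
  // lookupOnly reports whether the key exists without downloading the archive.
  const hitKey = await restoreCache(PATHS, KEY, undefined, { lookupOnly: true });
  if (hitKey !== undefined) {
    console.log(`Cache hit on ${hitKey}; skipping docker save.`);
    return;
  }
  await saveDockerImages();
  await saveCache(PATHS, KEY);
}
```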