Issues with `install_pkgs`'s parameter `output_image_name` in multiple image builds
lukokr-aarch64 opened this issue
bug report
Affected Rule
The issue is caused by the rule: `install_pkgs`
Is this a regression?
No.
Description
We have found a potential race condition in a large repository when `output_image_name` is not unique.
When running in parallel the following is prone to race conditions:
reset_cmd $image_id $cid %{output_image_name} # another target overwrites %{output_image_name}
"$DOCKER" $DOCKER_FLAGS save %{output_image_name} > %{output_file_name} # wrong target is saved
I understand that the user is expected to know that `output_image_name` is the unique name of the image as loaded into the docker runtime. However, this is not stated clearly, so in large codebases with many images it is possible that duplication will occur.
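To make the failure mode concrete, here is a minimal Python sketch of the interleaving described above. It is a toy model only, not the rule's actual implementation: `runtime_tags`, `reset_cmd`, `save`, and the image contents are hypothetical stand-ins for the docker runtime's tag table, the tagging step, and `docker save`.

```python
# Toy model of the docker runtime's tag table: tag name -> image contents.
# Everything here is a hypothetical stand-in, not rules_docker code.
runtime_tags = {}


def reset_cmd(tag, image):
    """Stand-in for the tagging step: point `tag` at `image`."""
    runtime_tags[tag] = image


def save(tag):
    """Stand-in for `docker save <tag>`: return whatever the tag points at now."""
    return runtime_tags[tag]


def build_interleaved(tag_a, tag_b):
    """Run two builds with the unlucky ordering: both tag first, then both save."""
    reset_cmd(tag_a, "image-with-wget")  # target A tags its image
    reset_cmd(tag_b, "image-with-curl")  # target B tags its image
    return save(tag_a), save(tag_b)      # each target then saves "its" tag


# Same name for both targets: A's save picks up B's image.
shared = build_interleaved("pkgs", "pkgs")
# Distinct names: each target saves the image it built.
unique = build_interleaved("pkgs_a", "pkgs_b")
```

With a shared name, target A ends up saving the image built by target B, which matches the "executable file not found" symptom reported in this issue.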
I would propose that `output_image_name` be made optional, and that when it is not specified, an image name based on the label path be created, similar to the intermediate container used for `container_test`. Furthermore, if the image is not expected to be used, would you consider removing this from the API and deleting the image to reduce bloat in the docker runtime?
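A rough sketch of how a label-derived default name could be computed, written in plain Python for illustration (the real change would live in the Starlark rule). The function name `default_image_name`, the `bazel/` prefix, and the sanitizing rule are assumptions made for this example, not the scheme used by `container_test` or by any merged patch. It also assumes main-repository labels of the form `//pkg:target`.

```python
import re


def default_image_name(label: str) -> str:
    """Derive a docker-style repository name from a label such as //pkg:target.

    Hypothetical helper: the "bazel/" prefix and the sanitizing rule are
    illustrative choices, not an existing rules_docker convention.
    """
    package, _, target = label.lstrip("/").partition(":")
    if not target:
        # "//foo/bar" is shorthand for "//foo/bar:bar".
        target = package.rsplit("/", 1)[-1]
    raw = f"bazel/{package}/{target}" if package else f"bazel/{target}"
    # Docker repository names are lowercase; replace other characters with "_".
    return re.sub(r"[^a-z0-9/._-]", "_", raw.lower())


name = default_image_name("//images/base:pkgs")
```

Because every target has a distinct label, names derived this way are distinct per target as long as labels do not differ only by case or by characters that the sanitizing step collapses; a real implementation would need to handle those edge cases.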
I have a WIP patch I am happy to put up if you agree with the above. Please let me know.
Minimal Reproduction
This is a bit tricky to reproduce because it is timing-related. I have uploaded a patch that reproduces the issue, using a hack to make the race more likely.
repro.patch
I have added a random sleep to highlight the race condition:
reset_cmd $image_id $cid %{output_image_name}
sleep $(( ($RANDOM % 4 * 4) + 1 ))s
"$DOCKER" $DOCKER_FLAGS save %{output_image_name} > %{output_file_name}
This, of course, only happens when `output_image_name` is not unique.
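Until names are generated automatically, a repository can be audited for reused names. The sketch below assumes a mapping from target label to `output_image_name` has already been collected by whatever means is convenient (for example from `bazel query` output; that step is not shown). The function name and the sample labels are hypothetical.

```python
from collections import defaultdict


def find_duplicate_image_names(targets):
    """Map each reused output_image_name to the sorted labels that share it."""
    by_name = defaultdict(list)
    for label, image_name in targets.items():
        by_name[image_name].append(label)
    return {
        name: sorted(labels)
        for name, labels in by_name.items()
        if len(labels) > 1
    }


# Hypothetical sample data: two targets accidentally share the name "pkgs".
duplicates = find_duplicate_image_names({
    "//images/a:install": "pkgs",
    "//images/b:install": "pkgs",
    "//images/c:install": "pkgs_c",
})
```

Any entry in the result identifies a group of targets that can race with each other when built in parallel.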
Exception or Error
When the race occurs, the wrong image is saved. With the above patch applied, this results in errors such as:
==================================
====== Test file: wget.yaml ======
==================================
=== RUN: Command Test: wget
--- FAIL
duration: 1.169700321s
Error: Error creating container: API error (400): failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "wget": executable file not found in $PATH: unknown
Your Environment
Operating System:
Ubuntu 20.04.5 LTS
Output of `bazel version`:
bazel 4.0.0
I am using bazelisk
Rules_docker version:
master@48ad6d6
Anything else relevant?
Closing since #2202 was merged. Happy to discuss further changes in this issue at a later time.