bazelbuild / rules_docker

Rules for building and handling Docker images with Bazel

Issues with `install_pkgs`'s parameter `output_image_name` in multiple image builds

lukokr-aarch64 opened this issue

🐞 bug report

Affected Rule

The issue is caused by the rule: `install_pkgs`

Is this a regression?

No.

Description

We have found a potential race condition in a large repository when `output_image_name` is not unique.
When targets run in parallel, the following sequence is prone to race conditions:

reset_cmd $image_id $cid %{output_image_name} # another target overwrites %{output_image_name} 
"$DOCKER" $DOCKER_FLAGS save %{output_image_name} > %{output_file_name} # wrong target is saved
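The interleaving described above can be illustrated without Docker. The following is a minimal sketch, assuming a temporary directory stands in for the Docker runtime's name-to-image table; `tag` and `save` are hypothetical stand-ins for `reset_cmd` and `docker save`, not the real commands. The steps run sequentially in the problematic order so that the outcome is deterministic.

```shell
#!/usr/bin/env bash
# Simulation only: a directory plays the role of the Docker runtime,
# and each file in it maps an image name to the image contents.
store="$(mktemp -d)"

# Hypothetical stand-ins for `reset_cmd` and `docker save`.
tag()  { printf '%s' "$2" > "$store/$1"; }  # tag <name> <contents>
save() { cat "$store/$1"; }                 # save <name>

# Problematic interleaving of two targets that share one image name.
tag shared_name "image-with-wget"      # target A tags its image
tag shared_name "image-without-wget"   # target B overwrites the same name
a_saved="$(save shared_name)"          # target A saves B's image
b_saved="$(save shared_name)"          # target B saves its own image

echo "A saved: $a_saved"
echo "B saved: $b_saved"
rm -rf "$store"
```

Both targets end up saving `image-without-wget`, which matches the symptom reported below, where a test cannot find `wget` in the saved image.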

I understand that the user is expected to know that `output_image_name` is the unique name of the image as loaded into the Docker runtime. However, this is not stated clearly, so in large codebases with many images duplicate names are likely to occur.

I would propose making `output_image_name` optional: when it is not specified, an image name based on the label path would be created, similar to the intermediate container used for `container_test`. Furthermore, if the image is not expected to be used afterwards, would you consider removing this parameter from the API and deleting the image to reduce bloat in the Docker runtime?
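One way the label-based naming could look is sketched below. This is an illustration only and not the contents of the WIP patch: the function `label_to_image_name`, the `bazel/` prefix, and the `install_pkgs` tag are all assumptions made for the example. It slugs the label into characters Docker accepts in repository names and appends a checksum of the original label so that two labels with the same slug are unlikely to collide.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of rules_docker: derive a Docker-safe,
# per-target image name from a Bazel label.
label_to_image_name() {
  local label="$1" slug sum
  # Lowercase, then collapse every run of characters outside [a-z0-9]
  # into a single "_" and trim "_" from both ends.
  slug="$(printf '%s' "$label" \
    | LC_ALL=C tr '[:upper:]' '[:lower:]' \
    | LC_ALL=C sed -e 's|[^a-z0-9]\{1,\}|_|g' -e 's|^_||' -e 's|_$||')"
  # A checksum of the original label makes collisions unlikely when two
  # labels slug to the same string, e.g. //a/b:c and //a:b_c.
  sum="$(printf '%s' "$label" | cksum | cut -d ' ' -f 1)"
  # The "bazel/" prefix and "install_pkgs" tag are arbitrary choices.
  printf 'bazel/%s_%s:install_pkgs\n' "$slug" "$sum"
}

label_to_image_name "//images/base:with_wget"
```

Because the name is a pure function of the label, repeated builds of the same target reuse one name, while different targets get different names without any user input.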

I have a WIP patch I am happy to put up if you agree with the above. Please let me know 👍

🔬 Minimal Reproduction

This is a bit tricky to reproduce because it is timing-related. I have uploaded a patch that reproduces the issue by adding a hack that widens the race window.
repro.patch

I have added a random sleep to highlight the race condition:

reset_cmd $image_id $cid %{output_image_name}
sleep $(( ($RANDOM % 4 * 4) + 1 ))s
"$DOCKER" $DOCKER_FLAGS save %{output_image_name} > %{output_file_name}

This of course only happens when `output_image_name` is not unique.

🔥 Exception or Error

When the race occurs, the wrong image is saved. With the above patch, this results in errors such as:


==================================
====== Test file: wget.yaml ======
==================================
=== RUN: Command Test: wget
--- FAIL
duration: 1.169700321s
Error: Error creating container: API error (400): failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "wget": executable file not found in $PATH: unknown

🌍 Your Environment

Operating System:

Ubuntu 20.04.5 LTS

Output of bazel version:

bazel 4.0.0

I am using bazelisk

Rules_docker version:

master@48ad6d6

Anything else relevant?

Closing since #2202 was merged. Happy to discuss further changes in this issue at a later time.