How to create deterministic layers?
njlr opened this issue
🐞 bug report
Affected Rule
The issue is caused by the rule: container_run_and_commit_layer, container_image (maybe)
Is this a regression?
Unsure
Description
When building a container_run_and_commit_layer target multiple times, the hash is not deterministic.
However, container-diff shows no differences at a file-level.
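One way identical file contents can still produce different archive hashes is through metadata stored in tar headers, such as modification times. Whether that is the cause here is not confirmed in this report; the sketch below only illustrates the effect on a throwaway file, assuming GNU tar and coreutils (touch -d, sha256sum) are available. The directory and file names are made up for the illustration.

```shell
set -eu

# Hypothetical scratch directory; nothing here touches the Bazel workspace.
work="$(mktemp -d)"
mkdir "$work/root"
echo "hello" > "$work/root/file.txt"

# Archive the same content twice, changing only the modification time.
touch -d "2020-01-01 00:00:00 UTC" "$work/root/file.txt"
tar -cf "$work/a.tar" -C "$work/root" file.txt
touch -d "2021-01-01 00:00:00 UTC" "$work/root/file.txt"
tar -cf "$work/b.tar" -C "$work/root" file.txt

# Hash the archives themselves.
hash_a="$(sha256sum "$work/a.tar" | cut -d' ' -f1)"
hash_b="$(sha256sum "$work/b.tar" | cut -d' ' -f1)"

# Hash the file content as extracted from each archive.
content_a="$(tar -xOf "$work/a.tar" file.txt | sha256sum | cut -d' ' -f1)"
content_b="$(tar -xOf "$work/b.tar" file.txt | sha256sum | cut -d' ' -f1)"

echo "archive a: $hash_a"
echo "archive b: $hash_b"
echo "content a: $content_a"
echo "content b: $content_b"
```

A comparison that looks only at file names and contents would report no difference between a.tar and b.tar, even though their sha256 sums are not equal.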
🔬 Minimal Reproduction
https://github.com/njlr/bazel-run-commit
WORKSPACE
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "bazel_skylib",
    urls = [
        "https://mirror.bazel.build/github.com/bazelbuild/bazel-skylib/releases/download/1.2.1/bazel-skylib-1.2.1.tar.gz",
        "https://github.com/bazelbuild/bazel-skylib/releases/download/1.2.1/bazel-skylib-1.2.1.tar.gz",
    ],
    sha256 = "f7be3474d42aae265405a592bb7da8e171919d74c16f082a5457840f06054728",
)

load("@bazel_skylib//:workspace.bzl", "bazel_skylib_workspace")

bazel_skylib_workspace()

http_archive(
    name = "io_bazel_rules_docker",
    sha256 = "b1e80761a8a8243d03ebca8845e9cc1ba6c82ce7c5179ce2b295cd36f7e394bf",
    urls = ["https://github.com/bazelbuild/rules_docker/releases/download/v0.25.0/rules_docker-v0.25.0.tar.gz"],
)

load(
    "@io_bazel_rules_docker//repositories:repositories.bzl",
    container_repositories = "repositories",
)

container_repositories()

load("@io_bazel_rules_docker//repositories:deps.bzl", container_deps = "deps")

container_deps()

load(
    "@io_bazel_rules_docker//container:container.bzl",
    "container_pull",
)

container_pull(
    name = "dotnet_runtime_deps_6_0_10",
    registry = "mcr.microsoft.com",
    repository = "dotnet/runtime-deps",
    tag = "6.0.10-bullseye-slim-amd64",
    digest = "sha256:24554fadd483d8305974ded44bb1dbe4916e2f02500b9e2d78e7beb557cfebd0",
)
BUILD.bazel
load("@io_bazel_rules_docker//container:container.bzl", "container_image")
load("@io_bazel_rules_docker//docker/util:run.bzl", "container_run_and_commit_layer")
load("@bazel_skylib//rules:copy_file.bzl", "copy_file")

container_run_and_commit_layer(
    name = "install_git",
    image = "@dotnet_runtime_deps_6_0_10//image",
    commands = [
        " && ".join([
            "apt-get update -y",
            "apt-get install -y git=1:2.30.2-1",
            "apt-get clean",
            "rm -rf /var/lib/apt/lists/*",
            "rm -rf /var/cache/ldconfig/aux-cache",
            "rm -rf /var/log/alternatives.log",
            "rm -rf /var/log/apt/term.log",
            "rm -rf /var/log/apt/history.log",
            "rm -rf /var/log/dpkg.log",
            "rm -rf /var/log/*",
            "rm -rf /var/cache/debconf/templates.dat",
            "rm -rf /var/lib/dpkg/status-old",
            "rm -rf /var/lib/dpkg/status",
            "rm -rf /var/cache/debconf/config.dat",
            "rm -rf /etc/ld.so.cache",
            "rm -rf /var/lib/apt/extended_states",
            "rm -rf /var/log/apt/eipp.log.xz",
            "git --version",
        ]),
    ],
)

container_image(
    name = "image",
    base = "@dotnet_runtime_deps_6_0_10//image",
    layers = [
        ":install_git",
    ],
)

copy_file(
    name = "image_archive",
    src = ":image.tar",
    out = "image_archive.tar",
    is_executable = False,
    allow_symlink = False,
)
test.sh
#!/bin/bash
set -e
set -o pipefail
rm -rf ./bazel-*
bazel clean
bazel build //:image_archive
sha256sum bazel-bin/image_archive.tar
rm -rf ./bazel-*
bazel clean
bazel build //:image_archive
sha256sum bazel-bin/image_archive.tar
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
INFO: Analyzed target //:image_archive (111 packages loaded, 7332 targets configured).
INFO: Found 1 target...
Target //:image_archive up-to-date:
bazel-bin/image_archive.tar
INFO: Elapsed time: 42.970s, Critical Path: 42.10s
INFO: 73 processes: 24 internal, 49 linux-sandbox.
INFO: Build completed successfully, 73 total actions
d51cbfa26560fe671e13655b0baa94a3d8426b4cc3a8726c2e4a2e05585ebc6b bazel-bin/image_archive.tar
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
INFO: Analyzed target //:image_archive (111 packages loaded, 7332 targets configured).
INFO: Found 1 target...
Target //:image_archive up-to-date:
bazel-bin/image_archive.tar
INFO: Elapsed time: 60.050s, Critical Path: 59.20s
INFO: 73 processes: 24 internal, 49 linux-sandbox.
INFO: Build completed successfully, 73 total actions
3b80585ed7dcf7f27590e48bb48b89d59ce6a1660f6ced7f081711c5e64fd064 bazel-bin/image_archive.tar
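To narrow down where two builds diverge, one option is to hash every member of each archive and diff the listings. The helper below, hash_tar_members, is a hypothetical name and not part of rules_docker; it is a sketch that assumes GNU find, sort, and xargs, and that the archive unpacks into ordinary files.

```shell
set -eu

# Print "<sha256>  <path>" for every regular file inside a tar archive,
# sorted by path so that two listings can be compared with diff.
# The body runs in a subshell, so the cd does not leak to the caller.
hash_tar_members() (
  archive="$1"
  tmp="$(mktemp -d)"
  # Unpack into a private scratch directory.
  tar -xf "$archive" -C "$tmp"
  cd "$tmp"
  # NUL-separated names keep paths with spaces intact; -r skips empty input.
  find . -type f -print0 | sort -z | xargs -0 -r sha256sum
  cd /
  rm -rf "$tmp"
)
```

For instance, copying bazel-bin/image_archive.tar aside after each build, writing each copy's listing to a file with hash_tar_members, and running diff on the two files would show only the members whose bytes changed between builds.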
🔥 Exception or Error
N/A
🌍 Your Environment
Operating System:
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04 LTS
Release: 22.04
Codename: jammy
Output of bazel version:
bazel --version
bazel 5.3.1
Rules_docker version:
http_archive(
    name = "io_bazel_rules_docker",
    sha256 = "b1e80761a8a8243d03ebca8845e9cc1ba6c82ce7c5179ce2b295cd36f7e394bf",
    urls = ["https://github.com/bazelbuild/rules_docker/releases/download/v0.25.0/rules_docker-v0.25.0.tar.gz"],
)
Anything else relevant?
Nope
Curiously, this seems to work:
 container_image(
     name = "image",
     base = "@dotnet_runtime_deps_6_0_10//image",
     layers = [
-        ":install_git",
     ],
+    tars = [
+        ":install_git",
+    ],
 )
Also strange: the hashes on GitHub CI and on my machine differ.
You call tools in your container which aren't hermetic, like apt-get install, so that tool produces a different output. Bazel can only provide determinism if the tools it runs do.
There are commands to clean up the noise from apt-get (although it is possible something was missed). It appears to be deterministic when using tars but not layers.
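Independently of how rules_docker treats tars and layers internally (which this thread does not establish), a common way to make a tar archive byte-for-byte reproducible is to pin the metadata that tar records. The following is a sketch, assuming GNU tar 1.28 or later (needed for --sort=name); deterministic_tar is a hypothetical helper name, not something rules_docker provides.

```shell
set -eu

# Pack a directory into a tar whose bytes depend only on file names,
# modes, and contents (hypothetical helper, GNU tar assumed).
deterministic_tar() {
  src_dir="$1"
  out_tar="$2"
  # --sort=name      fixes the order of entries
  # --mtime=@0       replaces every timestamp with the Unix epoch
  # --owner/--group  plus --numeric-owner drop the builder's user and group
  # --format=gnu     pins the header format
  tar --sort=name --mtime='@0' \
      --owner=0 --group=0 --numeric-owner \
      --format=gnu \
      -cf "$out_tar" -C "$src_dir" .
}
```

Packing the same directory twice with this helper, even after its files have been touched in between, should give archives with the same sha256sum.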
This fix also seems to improve remote cacheability, and may help solve #2195.