bazelbuild / rules_docker

Rules for building and handling Docker images with Bazel

How to create deterministic layers?

njlr opened this issue · comments

commented

🐞 bug report

Affected Rule

The issue is caused by the rule:
  • container_run_and_commit_layer
  • container_image (maybe)

Is this a regression?

Unsure

Description

When building a container_run_and_commit_layer target multiple times from a clean state, the hash of the resulting image archive differs between builds.

However, container-diff shows no differences at the file level.
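
When file contents match but archive hashes differ, one possible cause is tar metadata such as modification times or ownership, which a file-level diff may not surface. The following is a minimal sketch for checking that, assuming GNU tar is available. The function names `tar_metadata` and `compare_tar_metadata` are hypothetical and are not part of rules_docker or container-diff.

```shell
# Print one line per archive entry: permissions, numeric owner/group,
# size, full-resolution timestamp, and name (GNU tar flags).
tar_metadata() {
  tar --numeric-owner --full-time -tvf "$1"
}

# Diff the metadata listings of two archives.
# Returns 0 if the listings are identical, 1 if they differ,
# and 2 if a listing could not be produced.
compare_tar_metadata() {
  tmp_dir="$(mktemp -d)" || return 2
  tar_metadata "$1" > "$tmp_dir/a.txt" || { rm -rf "$tmp_dir"; return 2; }
  tar_metadata "$2" > "$tmp_dir/b.txt" || { rm -rf "$tmp_dir"; return 2; }
  status=0
  diff "$tmp_dir/a.txt" "$tmp_dir/b.txt" || status=$?
  rm -rf "$tmp_dir"
  return "$status"
}
```

This would be run against the layer archives extracted from two separate builds. Any line printed by `diff` points at an entry whose metadata changed even though its contents may be identical.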

🔬 Minimal Reproduction

https://github.com/njlr/bazel-run-commit

WORKSPACE

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
  name = "bazel_skylib",
  urls = [
    "https://mirror.bazel.build/github.com/bazelbuild/bazel-skylib/releases/download/1.2.1/bazel-skylib-1.2.1.tar.gz",
    "https://github.com/bazelbuild/bazel-skylib/releases/download/1.2.1/bazel-skylib-1.2.1.tar.gz",
  ],
  sha256 = "f7be3474d42aae265405a592bb7da8e171919d74c16f082a5457840f06054728",
)

load("@bazel_skylib//:workspace.bzl", "bazel_skylib_workspace")

bazel_skylib_workspace()

http_archive(
  name = "io_bazel_rules_docker",
  sha256 = "b1e80761a8a8243d03ebca8845e9cc1ba6c82ce7c5179ce2b295cd36f7e394bf",
  urls = ["https://github.com/bazelbuild/rules_docker/releases/download/v0.25.0/rules_docker-v0.25.0.tar.gz"],
)

load(
    "@io_bazel_rules_docker//repositories:repositories.bzl",
    container_repositories = "repositories",
)
container_repositories()

load("@io_bazel_rules_docker//repositories:deps.bzl", container_deps = "deps")

container_deps()

load(
  "@io_bazel_rules_docker//container:container.bzl",
  "container_pull",
)

container_pull(
  name = "dotnet_runtime_deps_6_0_10",
  registry = "mcr.microsoft.com",
  repository = "dotnet/runtime-deps",
  tag = "6.0.10-bullseye-slim-amd64",
  digest = "sha256:24554fadd483d8305974ded44bb1dbe4916e2f02500b9e2d78e7beb557cfebd0"
)

BUILD.bazel

load("@io_bazel_rules_docker//container:container.bzl", "container_image")
load("@io_bazel_rules_docker//docker/util:run.bzl", "container_run_and_commit_layer")
load("@bazel_skylib//rules:copy_file.bzl", "copy_file")

container_run_and_commit_layer(
  name = "install_git",
  image = "@dotnet_runtime_deps_6_0_10//image",
  commands = [
    " && ".join([
      "apt-get update -y",
      "apt-get install -y git=1:2.30.2-1",
      "apt-get clean",
      "rm -rf /var/lib/apt/lists/*",
      "rm -rf /var/cache/ldconfig/aux-cache",
      "rm -rf /var/log/alternatives.log",
      "rm -rf /var/log/apt/term.log",
      "rm -rf /var/log/apt/history.log",
      "rm -rf /var/log/dpkg.log",
      "rm -rf /var/log/*",
      "rm -rf /var/cache/debconf/templates.dat",
      "rm -rf /var/lib/dpkg/status-old",
      "rm -rf /var/lib/dpkg/status",
      "rm -rf /var/cache/debconf/config.dat",
      "rm -rf /etc/ld.so.cache",
      "rm -rf /var/lib/apt/extended_states",
      "rm -rf /var/log/apt/eipp.log.xz",
      "git --version",
    ]),
  ],
)

container_image(
  name = "image",
  base = "@dotnet_runtime_deps_6_0_10//image",
  layers = [
    ":install_git",
  ],
)

copy_file(
  name = "image_archive",
  src = ":image.tar",
  out = "image_archive.tar",
  is_executable = False,
  allow_symlink = False,
)

test.sh

#!/bin/bash

set -e
set -o pipefail

rm -rf ./bazel-*

bazel clean

bazel build //:image_archive

sha256sum bazel-bin/image_archive.tar

rm -rf ./bazel-*

bazel clean

bazel build //:image_archive

sha256sum bazel-bin/image_archive.tar

Output

INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
INFO: Analyzed target //:image_archive (111 packages loaded, 7332 targets configured).
INFO: Found 1 target...
Target //:image_archive up-to-date:
  bazel-bin/image_archive.tar
INFO: Elapsed time: 42.970s, Critical Path: 42.10s
INFO: 73 processes: 24 internal, 49 linux-sandbox.
INFO: Build completed successfully, 73 total actions
d51cbfa26560fe671e13655b0baa94a3d8426b4cc3a8726c2e4a2e05585ebc6b  bazel-bin/image_archive.tar
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
INFO: Analyzed target //:image_archive (111 packages loaded, 7332 targets configured).
INFO: Found 1 target...
Target //:image_archive up-to-date:
  bazel-bin/image_archive.tar
INFO: Elapsed time: 60.050s, Critical Path: 59.20s
INFO: 73 processes: 24 internal, 49 linux-sandbox.
INFO: Build completed successfully, 73 total actions
3b80585ed7dcf7f27590e48bb48b89d59ce6a1660f6ced7f081711c5e64fd064  bazel-bin/image_archive.tar
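
Comparing the two digests by eye can be automated. The following is a small sketch, assuming `sha256sum` from GNU coreutils as used in test.sh. `digest_of` and `same_digest` are hypothetical helper names, and the `/tmp/first_build.tar` path in the usage comment is only an illustration.

```shell
# Print only the hex SHA-256 digest of a file.
digest_of() {
  sha256sum "$1" | cut -d ' ' -f 1
}

# Return 0 if both files have the same digest, 1 if they differ,
# and 2 if either file is missing.
same_digest() {
  [ -f "$1" ] && [ -f "$2" ] || return 2
  [ "$(digest_of "$1")" = "$(digest_of "$2")" ]
}

# Possible usage inside a script like test.sh (hypothetical paths):
#   cp bazel-bin/image_archive.tar /tmp/first_build.tar   # after build 1
#   ... clean and rebuild ...
#   same_digest /tmp/first_build.tar bazel-bin/image_archive.tar \
#     || echo "image archive is not reproducible"
```

The first archive has to be copied aside before the second build, because `bazel clean` removes the output tree.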

🔥 Exception or Error

N/A

🌍 Your Environment

Operating System:

lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04 LTS
Release:	22.04
Codename:	jammy

Output of bazel version:

bazel --version
bazel 5.3.1

Rules_docker version:

http_archive(
  name = "io_bazel_rules_docker",
  sha256 = "b1e80761a8a8243d03ebca8845e9cc1ba6c82ce7c5179ce2b295cd36f7e394bf",
  urls = ["https://github.com/bazelbuild/rules_docker/releases/download/v0.25.0/rules_docker-v0.25.0.tar.gz"],
)

Anything else relevant?

Nope

commented

Curiously, this seems to work:

container_image(
  name = "image",
  base = "@dotnet_runtime_deps_6_0_10//image",
  layers = [
-    ":install_git",
  ],
+  tars = [
+    ":install_git",
+  ],
)

commented

It is also strange that the hashes on GitHub CI and on my machine differ.

You call tools in your container that aren't hermetic, such as apt-get install, so they can produce different output on each run. Bazel can only provide determinism if the tools it runs are themselves deterministic.
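
To illustrate the point about tool determinism: by default tar records each file's modification time and ownership and writes entries in directory-read order, so two runs over identical file contents can yield different bytes. The following is a sketch of normalizing those fields, assuming GNU tar 1.28 or newer (needed for `--sort=name`). `make_deterministic_tar` is a hypothetical helper and not something rules_docker provides.

```shell
# Create an archive whose bytes depend only on file names, modes,
# and contents, not on when or by whom the files were written.
#   $1: output archive path (should be outside the source directory)
#   $2: directory to archive
make_deterministic_tar() {
  tar --sort=name \
      --mtime='@0' \
      --owner=0 --group=0 --numeric-owner \
      -cf "$1" -C "$2" .
}
```

Archiving the same directory twice with this helper should give byte-identical output even if the files were touched in between, which is the property a layer needs for its hash to be stable.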

commented

> You call tools in your container that aren't hermetic, such as apt-get install, so they can produce different output on each run. Bazel can only provide determinism if the tools it runs are themselves deterministic.

There are commands to clean up the noise from apt-get (although it is possible something was missed). It appears to be deterministic when using tars but not layers.

This workaround (using tars instead of layers) also seems to improve remote cacheability, and may help solve #2195.