[v0.14] invalid cached file sent to context with buildx
dwmh opened this issue
Contributing guidelines
- I've read the contributing guidelines and wholeheartedly agree
I've found a bug and checked that ...
- ... the documentation does not mention anything about my problem
- ... there are no open or closed issues that are related to my problem
Description
When files in the build context have the same size, filename, and timestamp, newer buildx versions use the client-side cache incorrectly. I have verified that v0.13.1 is the last version to work correctly; v0.14.0 and v0.15.1 both exhibit the issue.
The issue can be reproduced with the simple shell script in the Configuration section.
Expected behaviour
Both app1 and app2 in the example provided should get correct input files as they are different.
Actual behaviour
The build of app1 caches the code file, and the build of app2 incorrectly reuses it.
Buildx version
github.com/docker/buildx v0.15.1 1c1dbb2
Docker info
Client:
Version: 26.1.0
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.15.1
Path: REDACTED
Server:
Containers: 89
Running: 0
Paused: 0
Stopped: 89
Images: 77
Server Version: 26.1.0
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 926c9586fe4a6236699318391cd44976a98e31f1
runc version: 51d5e94601ceffbbd85688df1c928ecccbfa4685
init version: de40ad007797e0dcd8b7126f27bb87401d224240
Security Options:
apparmor
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.9.4-gentoo
Operating System: Gentoo Linux
OSType: linux
Architecture: x86_64
CPUs: 24
Total Memory: 31.26GiB
Name: REDACTED
ID: Z27R:WA35:4HYF:B4VA:3TJF:CPFY:GAQI:RX3B:UISW:IDVF:QYBQ:TFH2
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Builders list
NAME/NODE DRIVER/ENDPOINT STATUS BUILDKIT PLATFORMS
default* docker
\_ default \_ default running v0.13.1 linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/386
Configuration
#!/bin/sh
# Enable buildkit
export DOCKER_BUILDKIT=1
# Generate dockerfile that builds both apps
cat > Dockerfile <<EOF
FROM alpine:3.20
COPY code /code
CMD /bin/sh /code
EOF
# Generate application code with identical timestamps and sizes
mkdir -p app1 app2
rm -f app1/code app2/code
touch -d 01010101 app1/code app2/code
cat > app1/code <<EOF
#!/bin/sh
echo app1 output
EOF
cat > app2/code <<EOF
#!/bin/sh
echo app2 output
EOF
touch -d 01010101 app1/code app2/code
# Build application images without server side cache
docker build --no-cache -f ./Dockerfile ./app1 --tag localhost/app1:latest
docker build --no-cache -f ./Dockerfile ./app2 --tag localhost/app2:latest
# Verify that incorrect code is used
docker run localhost/app1:latest
docker run localhost/app2:latest
Build logs
+ export DOCKER_BUILDKIT=1
+ DOCKER_BUILDKIT=1
+ cat
+ mkdir -p app1 app2
+ rm -f app1/code app2/code
+ touch -d 01010101 app1/code app2/code
+ cat
+ cat
+ touch -d 01010101 app1/code app2/code
+ docker build --no-cache -f ./Dockerfile ./app1 --tag localhost/app1:latest
[+] Building 1.1s (7/7) FINISHED docker:default
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 90B 0.0s
=> [internal] load metadata for docker.io/library/alpine:3.20 1.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 26B 0.0s
=> CACHED [1/2] FROM docker.io/library/alpine:3.20@sha256:b89d9c93e9ed3597455c90a0b88a8bbb5cb7188438f70953fede212a0c4394e0 0.0s
=> [2/2] COPY code /code 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:735f91cf114dcd6919cd01d0bca0f9ce754d60e71cb16f53b9b43e9095f2e2c6 0.0s
=> => naming to localhost/app1:latest 0.0s
+ docker build --no-cache -f ./Dockerfile ./app2 --tag localhost/app2:latest
[+] Building 0.4s (7/7) FINISHED docker:default
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 90B 0.0s
=> [internal] load metadata for docker.io/library/alpine:3.20 0.2s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 26B 0.0s
=> CACHED [1/2] FROM docker.io/library/alpine:3.20@sha256:b89d9c93e9ed3597455c90a0b88a8bbb5cb7188438f70953fede212a0c4394e0 0.0s
=> [2/2] COPY code /code 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:7bd53c686e8f29a63b4ec3b1bcc2a0785b450f3aee4fc3f8a0597a534080e5e4 0.0s
=> => naming to localhost/app2:latest 0.0s
+ docker run localhost/app1:latest
app1 output
+ docker run localhost/app2:latest
app1 output
Additional info
The issue was originally discovered in git repositories that happened to add files with the same name and size to multiple folders, after which the timestamps were aligned when others cloned the repository.
There are at least three workarounds known to me at the moment:
- sleep 0.1 between touching each file before building, which makes the timestamps unique
- run docker buildx prune -f between builds (slow)
- use named build contexts, as described in https://www.docker.com/blog/dockerfiles-now-support-multiple-build-contexts/
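The first workaround can be sketched as follows. This is a minimal illustration, not part of the original report; the file contents and paths mirror the repro script, and it assumes a filesystem with sub-second mtime resolution (e.g. ext4):

```shell
# Workaround sketch: make the timestamps unique so the client-side cache
# keys (filename + size + mtime) no longer collide between contexts.
mkdir -p app1 app2
printf '#!/bin/sh\necho app1 output\n' > app1/code
sleep 0.1
printf '#!/bin/sh\necho app2 output\n' > app2/code
# Sub-second mtimes now differ even though size and filename match:
stat -c '%n %y' app1/code app2/code
```

With distinct mtimes, the metadata triple no longer matches across the two contexts, so the stale cache entry is not reused.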
I found a similar old issue on the buildkit side, which was related to Dockerfile handling: moby/buildkit#1368
The filenames and timestamps suggest a similar underlying issue in this regression.
I don't know why you've marked it as a regression. This is expected behavior (moby/buildkit#4817) if you reset the timestamps of files, the same as rsync and git.
The reasoning is that older versions used to work; however, if this is expected behavior, we must have been relying on a bug.
I would be interested to know how monorepos are expected to set up Docker image builds, as you can very easily end up with identical timestamps across different folders when those files are created or modified by the same checkout. Providing the entire monorepo as the build context seems wasteful and slow, so my guess would be to use named build contexts per subproject. But this feels like a hack, since you still need to provide the default build context in addition, which will essentially be an empty folder.
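The named-build-context setup described above might look like the following sketch. The context name "code", the Dockerfile.ctx filename, and the empty default-context directory are all illustrative assumptions, not anything prescribed by buildx:

```shell
# Hypothetical monorepo layout: each subproject gets its own named context,
# and the default context is an empty directory.
mkdir -p empty app1
printf '#!/bin/sh\necho app1 output\n' > app1/code
cat > Dockerfile.ctx <<'EOF'
FROM alpine:3.20
# COPY --from=<name> reads from a named context passed via --build-context
COPY --from=code code /code
CMD /bin/sh /code
EOF
# With a running daemon, each subproject builds against its own context:
#   docker buildx build --build-context code=./app1 -f Dockerfile.ctx \
#     -t localhost/app1:latest ./empty
```

Each subproject then only transfers its own files, at the cost of the somewhat awkward empty default context.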
Sorry, this is indeed a regression and is fixed in #2558
While the metadata-based transfer itself is correct, it should not affect your example case, as you are transferring from a different directory. We have validation so that incremental transfers are only used when the same source directory (basename + node identifier) is reused. That part broke in v0.14.
You can still hit a similar case if you make a non-metadata change to a file and then reset its timestamp, but you should not be able to trigger it when building two different directories that each contain a file with the same timestamp.
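The remaining expected-behavior case can be demonstrated without a daemon. This sketch (my illustration, not from the thread) shows that an equal-length content edit followed by a timestamp reset leaves the size+mtime metadata identical, which is exactly what metadata-based incremental transfer cannot detect:

```shell
# Write a file, pin its timestamp, and record its metadata.
printf '#!/bin/sh\necho v1\n' > code
touch -d '2024-01-01 00:00:00' code
before=$(stat -c '%s %Y' code)
# Make a same-length content change, then reset the timestamp.
printf '#!/bin/sh\necho v2\n' > code
touch -d '2024-01-01 00:00:00' code
after=$(stat -c '%s %Y' code)
# Size and mtime are identical even though the content changed:
echo "before: $before / after: $after"
```

Within the same source directory, such a file would be skipped by an incremental transfer, matching the rsync/git comparison made earlier in the thread.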