docker-library / official-images

Primary source of truth for the Docker "Official Images" program

Home Page: https://hub.docker.com/u/library

s390x build images on mariadb/node timed out

grooverdan opened this issue · comments

ref MDEV-31529

https://doi-janky.infosiftr.net/job/multiarch/job/s390x/job/mariadb/

Seems they have been for a while.

Logs are showing what appears to be a normal build; it just goes over the 3h timeout.

amd64 builds are 24 seconds and there's nothing different on the s390x builds.

Even a build stage of s390x under qemu is only taking 4 minutes (https://buildbot.mariadb.org/#/builders/311/builds/16623).

Other instances:

Other official images on s390x appear to be failing for a different reason.

I'm trying to run a test to verify, but my best guess is that buildah that you're using there sets up the RUN environment differently than Docker and that this is actually some debconf prompt hanging (https://bugs.debian.org/929417) 😬

Does buildah do something different for something like this than FD 0 being open to /dev/null?

$ docker build - <<<$'FROM bash\nRUN ls -l /proc/self/fd/0; false'
Sending build context to Docker daemon  2.048kB
Step 1/2 : FROM bash
 ---> 9877ef007e1a
Step 2/2 : RUN ls -l /proc/self/fd/0; false
 ---> Running in d1548625d92e
lrwx------    1 root     root            64 Jun 23 00:16 /proc/self/fd/0 -> /dev/null
The command '/bin/sh -c ls -l /proc/self/fd/0; false' returned a non-zero code: 1

Aha, reproduced and got docker top on it:

$ docker top optimistic_benz
UID                 PID                 PPID                C                   STIME               TTY                 TIME                CMD
root                1095793             1095772             0                   17:26               ?                   00:00:00            /bin/sh -c set -ex; ?{ ??echo "mariadb-server" mysql-server/root_password password 'unused'; ??echo "mariadb-server" mysql-server/root_password_again password 'unused'; ?} | debconf-set-selections; ?apt-get update; ?apt-get install -y --no-install-recommends mariadb-server="$MARIADB_VERSION" mariadb-backup socat ?; ?rm -rf /var/lib/apt/lists/*; ?rm -rf /var/lib/mysql; ?mkdir -p /var/lib/mysql /var/run/mysqld; ?chown -R mysql:mysql /var/lib/mysql /var/run/mysqld; ?chmod 777 /var/run/mysqld; ?find /etc/mysql/ -name '*.cnf' -print0 ??| xargs -0 grep -lZE '^(bind-address|log|user\s)' ??| xargs -rt -0 sed -Ei 's/^(bind-address|log|user\s)/#&/'; ?printf "[mariadb]\nhost-cache-size=0\nskip-name-resolve\n" > /etc/mysql/mariadb.conf.d/05-skipcache.cnf; ?if [ -L /etc/mysql/my.cnf ]; then ??sed -i -e '/includedir/ {N;s/\(.*\)\n\(.*\)/\n\2\n\1/}' /etc/mysql/mariadb.cnf; ?fi
root                1096121             1095793             2                   17:26               ?                   00:00:00            apt-get install -y --no-install-recommends mariadb-server=1:10.9.7+maria~ubu2204 mariadb-backup socat
root                1096888             1096121             0                   17:27               pts/0               00:00:00            /usr/bin/dpkg --status-fd 26 --configure --pending
root                1096954             1096888             0                   17:27               pts/0               00:00:00            /usr/bin/perl -w /usr/share/debconf/frontend /var/lib/dpkg/info/mariadb-server.postinst configure
root                1096966             1096954             0                   17:27               pts/0               00:00:00            /bin/bash /var/lib/dpkg/info/mariadb-server.postinst configure
root                1096988             1096966             0                   17:27               pts/0               00:00:00            bash /usr/bin/mariadb-install-db --rpm --cross-bootstrap --user=mysql --disable-log-bin --skip-test-db
root                1096989             1096966             0                   17:27               pts/0               00:00:00            logger -p daemon err -t mariadb-server.postinst -i
root                1097011             1096988             0                   17:27               pts/0               00:00:00            bash /usr/bin/mariadb-install-db --rpm --cross-bootstrap --user=mysql --disable-log-bin --skip-test-db
root                1097012             1096988             0                   17:27               pts/0               00:00:00            bash /usr/bin/mariadb-install-db --rpm --cross-bootstrap --user=mysql --disable-log-bin --skip-test-db
root                1097013             1096988             0                   17:27               pts/0               00:00:00            bash /usr/bin/mariadb-install-db --rpm --cross-bootstrap --user=mysql --disable-log-bin --skip-test-db
root                1097014             1097011             0                   17:27               pts/0               00:00:00            cat /usr/share/mysql/mysql_system_tables.sql /usr/share/mysql/mysql_performance_tables.sql /usr/share/mysql/mysql_system_tables_data.sql /usr/share/mysql/fill_help_tables.sql /usr/share/mysql/maria_add_gis_sp_bootstrap.sql /usr/share/mysql/mysql_sys_schema.sql
root                1097015             1097012             0                   17:27               pts/0               00:00:00            sed -e /@current_hostname/d
systemd+            1097016             1097013             87                  17:27               pts/0               00:00:12            /usr/sbin/mariadbd --lc-messages-dir=/usr/share/mysql/english/.. --bootstrap --silent-startup --basedir=/usr --datadir=/var/lib/mysql --log-warnings=0 --enforce-storage-engine= --plugin-dir=/usr/lib/mysql/plugin --disable-log-bin --user=mysql --max_allowed_packet=8M --net_buffer_length=16K

So it seems like whatever initialization is happening here is maybe just really slow? (our s390x machine isn't as speedy as some we've had in the past 🙈)

What's sad is that we throw away this initialization with rm -rf /var/lib/mysql in about the next step.

I think I can get it to skip the intense bit of this step by faking an already installed datadir pre-installation. Thanks for the analysis.
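A minimal sketch of the "fake an already installed datadir" idea, under assumptions this thread does not confirm: that the package's post-install step skips the expensive bootstrap when it finds what looks like an existing system table under the data directory, and that mysql/user.frm is a sufficient marker. The helper name fake_datadir and the marker path are hypothetical; the real postinst logic should be checked before relying on this.

```shell
#!/bin/sh
# Hypothetical helper: create a placeholder that makes a data directory
# look already initialized. The marker path (mysql/user.frm) is an
# assumption for illustration, not something confirmed in this thread.
fake_datadir() {
    datadir=$1
    mkdir -p "$datadir/mysql"
    touch "$datadir/mysql/user.frm"
}

# Demonstrate against a scratch directory rather than the real /var/lib/mysql.
scratch=$(mktemp -d)
fake_datadir "$scratch/var/lib/mysql"
ls -l "$scratch/var/lib/mysql/mysql"
```

In the Dockerfile this would run before apt-get install, and the existing rm -rf /var/lib/mysql later in the same RUN step would remove the placeholder along with everything else.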

Does this really account for a 3 hr timeout though?

$ buildah bud 
STEP 1/2: FROM bash
STEP 2/2: RUN ls -l /proc/self/fd/0; false
lr-x------    1 root     root            64 Jun 23 00:59 /proc/self/fd/0 -> pipe:[613333]
Error: building at STEP "RUN ls -l /proc/self/fd/0; false": while running runtime: exit status 1

Maybe not needed, but FYI anyway.
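The difference between the two builders (stdin on /dev/null under docker build, stdin on a pipe under buildah) can also be checked from inside a RUN step without reading /proc. This is a small sketch; describe_stdin is a hypothetical helper name, and it assumes /dev/stdin resolves to file descriptor 0, as it does on Linux.

```shell
#!/bin/sh
# Report what kind of file is attached to standard input.
describe_stdin() {
    if [ -t 0 ]; then
        echo "tty"
    elif [ -p /dev/stdin ]; then
        echo "pipe"
    elif [ -c /dev/stdin ]; then
        echo "character device"
    else
        echo "other"
    fi
}

# Mimic the two situations seen in the logs above.
describe_stdin </dev/null
echo | describe_stdin
```

A script that might prompt can use the same check to decide whether anything could ever answer on stdin.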

The thing I hadn't noticed was that 10.5/10.6 built successfully.

10.5/10.6 - focal base: succeeded
10.9+ - jammy base: failed

Jammy has liburing, and therefore it's being used during the database initialization step. This may have hung the process.
This could explain why it worked on our buildbot under emulation, where there is no io_uring support:

podman run --rm --arch s390x --env MARIADB_ALLOW_EMPTY_ROOT_PASSWORD=1  quay.io/mariadb-foundation/mariadb-devel:10.9
..
2023-06-23  4:02:26 0 [Warning] mariadbd: io_uring_queue_init() failed with ENOSYS: check seccomp filters, and the kernel version (newer than 5.1 required)

On node:
node 20 alpine/alpine-3.17 failed
node 20 debian succeeded
node 16,18 alpine/debian all succeed

(no idea of difference here)

Is the kernel the latest on this builder?

MariaDB/mariadb-docker#518, which I merged, will eliminate the slow part of the build and any uring usage, and so bypass the potential problem. If this succeeds next week, can you make a note of the kernel version and the filesystem on which docker was building, so some kernel people can trace this down?
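One way to capture the two facts requested here is sketched below. report_build_env is a hypothetical helper, /var/lib/docker is only an assumed default for where the builder keeps its data, and the stat invocation assumes GNU coreutils (or a compatible stat).

```shell
#!/bin/sh
# Hypothetical helper: print the running kernel release and the filesystem
# type backing a directory. The default path is an assumed Docker data root.
report_build_env() {
    dir=${1:-/var/lib/docker}
    echo "kernel: $(uname -r)"
    # stat -f reports on the filesystem; %T prints its type in readable form.
    echo "filesystem: $(stat -f -c %T "$dir")"
}

# Use / here so the example runs on machines without a Docker data root.
report_build_env /
```

On the builder itself the argument would be whichever directory the build storage actually lives in.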

Is the kernel the latest on this builder?

It's not latest latest, but should be fairly recent (and userspace packages are up to date from Debian Bullseye): 5.10.0-21-s390x
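For reference, that release string can be compared against the 5.1 threshold named in the io_uring_queue_init() warning quoted earlier. This is only a sketch: kernel_at_least is a hypothetical helper that looks at the major and minor numbers alone, and passing it says nothing about seccomp filters, which the same warning also mentions.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel release string is at least
# the given major.minor. Only the first two numeric fields are compared.
kernel_at_least() {
    release=$1
    major=${release%%.*}       # text before the first dot
    rest=${release#*.}         # text after the first dot
    minor=${rest%%[!0-9]*}     # leading digits of the remainder
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

if kernel_at_least "5.10.0-21-s390x" 5 1; then
    echo "meets the 5.1 minimum"
else
    echo "older than 5.1"
fi
```

To test the running kernel instead of a fixed string, pass "$(uname -r)" as the first argument.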