dbhi / qus

qemu-user-static (qus) and containers, non-invasive minimal working setups

Home Page: https://dbhi.github.io/qus


Running arm32/v7 containers on an aarch64-only host

Silex opened this issue · comments

Hello,

Coming here from multiarch/qemu-user-static#77, it looks like qus works for i386 and x86_64, but not for arm:

gitlab-runner@docker-emacs:~$ uname -a
Linux docker-emacs 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:21:09 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux

gitlab-runner@docker-emacs:~$ docker run --rm --privileged aptman/qus -- -r
cat ./qemu-binfmt-conf.sh | sh -s -- --path=/qus/bin -r

gitlab-runner@docker-emacs:~$ docker run --rm --privileged aptman/qus -s -- -p
cat ./qemu-binfmt-conf.sh | sh -s -- --path=/qus/bin -p --suffix -static
Setting /qus/bin/qemu-i386-static as binfmt interpreter for i386
Setting /qus/bin/qemu-i386-static as binfmt interpreter for i486
Setting /qus/bin/qemu-alpha-static as binfmt interpreter for alpha
Setting /qus/bin/qemu-armeb-static as binfmt interpreter for armeb
Setting /qus/bin/qemu-sparc-static as binfmt interpreter for sparc
Setting /qus/bin/qemu-sparc32plus-static as binfmt interpreter for sparc32plus
Setting /qus/bin/qemu-sparc64-static as binfmt interpreter for sparc64
Setting /qus/bin/qemu-ppc-static as binfmt interpreter for ppc
Setting /qus/bin/qemu-ppc64-static as binfmt interpreter for ppc64
Setting /qus/bin/qemu-ppc64le-static as binfmt interpreter for ppc64le
Setting /qus/bin/qemu-m68k-static as binfmt interpreter for m68k
Setting /qus/bin/qemu-mips-static as binfmt interpreter for mips
Setting /qus/bin/qemu-mipsel-static as binfmt interpreter for mipsel
Setting /qus/bin/qemu-mipsn32-static as binfmt interpreter for mipsn32
Setting /qus/bin/qemu-mipsn32el-static as binfmt interpreter for mipsn32el
Setting /qus/bin/qemu-mips64-static as binfmt interpreter for mips64
Setting /qus/bin/qemu-mips64el-static as binfmt interpreter for mips64el
Setting /qus/bin/qemu-sh4-static as binfmt interpreter for sh4
Setting /qus/bin/qemu-sh4eb-static as binfmt interpreter for sh4eb
Setting /qus/bin/qemu-s390x-static as binfmt interpreter for s390x
Setting /qus/bin/qemu-aarch64_be-static as binfmt interpreter for aarch64_be
Setting /qus/bin/qemu-hppa-static as binfmt interpreter for hppa
Setting /qus/bin/qemu-riscv32-static as binfmt interpreter for riscv32
Setting /qus/bin/qemu-riscv64-static as binfmt interpreter for riscv64
Setting /qus/bin/qemu-xtensa-static as binfmt interpreter for xtensa
Setting /qus/bin/qemu-xtensaeb-static as binfmt interpreter for xtensaeb
Setting /qus/bin/qemu-microblaze-static as binfmt interpreter for microblaze
Setting /qus/bin/qemu-microblazeel-static as binfmt interpreter for microblazeel
Setting /qus/bin/qemu-or1k-static as binfmt interpreter for or1k
Setting /qus/bin/qemu-x86_64-static as binfmt interpreter for x86_64

gitlab-runner@docker-emacs:~$ docker run --rm -it i386/ubuntu bash -c 'uname -m'
i686

gitlab-runner@docker-emacs:~$ docker run --rm -it amd64/ubuntu bash -c 'uname -m'
x86_64

gitlab-runner@docker-emacs:~$ docker run --rm -it arm32v7/ubuntu bash -c 'uname -m'
standard_init_linux.go:211: exec user process caused "exec format error"
failed to resize tty, using default size

Any idea of what I should do? On https://askubuntu.com/questions/1090351/can-i-run-an-arm32-bit-app-on-an-arm64bit-platform-which-is-running-ubuntu-16-04 they suggest adding armhf as a foreign architecture, but that looks like guesswork, so I prefer to ask here first.

Hi @Silex! It seems that you are trying to use qus on an aarch64 host to target armhf. I believe I explicitly excluded armhf/armv7 targets for aarch64 hosts because, AFAIK, all aarch64 devices are armv8, which is capable of running either aarch64 or aarch32. The latter is supposed to be a superset of armhf/armv7.

Hence, in your case, I would expect arm32v7/ubuntu to work without qus. In fact, I have used either arm32v7 or arm64v8 on an RPi 3 with Raspbian (which is a 32-bit OS and should not be able to execute arm64v8), without qus.

May I know which device you are trying to run arm32v7 containers on?

Any idea of what I should do? On askubuntu.com/questions/1090351/can-i-run-an-arm32-bit-app-on-an-arm64bit-platform-which-is-running-ubuntu-16-04 they suggest adding armhf as a foreign architecture, but that looks like guesswork, so I prefer to ask here first.

In any case, I think you should not need to add libraries for a foreign architecture to your host. That is precisely the purpose of using a container. So, when using arm32v7 or arm64v8 you are effectively changing the set of standard libraries that the target application will find in the environment.

Hi @Silex! It seems that you are trying to use qus on an aarch64 host to target armhf. I believe I explicitly excluded armhf/armv7 targets for aarch64 hosts because, AFAIK, all aarch64 devices are armv8, which is capable of running either aarch64 or aarch32. The latter is supposed to be a superset of armhf/armv7.
Hence, in your case, I would expect arm32v7/ubuntu to work without qus. In fact, I have used either arm32v7 or arm64v8 on an RPi 3 with Raspbian (which is a 32-bit OS and should not be able to execute arm64v8), without qus.

I thought so too before running into this issue 😅

May I know which device you are trying to run arm32v7 containers on?

From what I know, it's a 96-core ThunderX system running Ubuntu 18.04. Feel free to ask me for more information; here is part of /proc/cpuinfo:

root@docker-emacs:~# cat /proc/cpuinfo 
processor	: 0
BogoMIPS	: 200.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer	: 0x43
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0x0a1
CPU revision	: 1

Any idea of what I should do? On askubuntu.com/questions/1090351/can-i-run-an-arm32-bit-app-on-an-arm64bit-platform-which-is-running-ubuntu-16-04 they suggest adding armhf as a foreign architecure but it looks random, I prefer to ask here first.

In any case, I think you should not need to add libraries for a foreign architecture to your host. That is precisely the purpose of using a container. So, when using arm32v7 or arm64v8 you are effectively changing the set of standard libraries that the target application will find in the environment.

Alright. Let me know if you have any suggestions 😉 or questions I should ask the owner of this machine...

I just tested and can confirm that, without qus, only arm64v8/ubuntu works, and all of i386/ubuntu, amd64/ubuntu and arm32v7/ubuntu fail.

From what I know, it's a 96-core ThunderX system running Ubuntu 18.04.

Alright. Let me know if you have any suggestions 😉 or questions I should ask the owner of this machine...

I'm going to assume it's a recent version of ThunderX that supports 64-bit only. I'd be glad if you could ask the owner about it. However, this is just curiosity (see below).

I just tested and can confirm that, without qus, only arm64v8/ubuntu works, and all of i386/ubuntu, amd64/ubuntu and arm32v7/ubuntu fail.

That's partially good news. It means that qus works as expected.

  • The qemu-user binaries included in the default aptman/qus image used on aarch64 hosts are extracted from http://ftp.debian.org/debian/pool/main/q/qemu/qemu-user-static_4.2-6_arm64.deb.
  • As you can see, qemu-arm-static, qemu-armeb-static, qemu-aarch64-static and qemu-aarch64_be-static exist, but there is no qemu-armhf-static. That feels like a packaging error... It doesn't make sense to have qemu-aarch64* binaries in the DEB for an aarch64 host. However:
    • On the one hand, your logs show that only the *_be versions are registered, which is good.
    • On the other hand, it seems that qemu-armhf-static never existed...

As a result, I'd say that you'd need qemu-arm-static to be registered, but for some reason it is not: only aarch64_be and armeb are being registered. This leads me to think that it might be some issue with https://github.com/qemu/qemu/blob/master/scripts/qemu-binfmt-conf.sh. Et voilà: https://github.com/qemu/qemu/blob/master/scripts/qemu-binfmt-conf.sh#L139-L170

Since qemu-binfmt-conf.sh registers the targets by families, and arm/armhf/aarch64 is considered the same family, arm is not registered on aarch64: https://github.com/qemu/qemu/blob/master/scripts/qemu-binfmt-conf.sh#L314.
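For context, registering an interpreter through binfmt_misc boils down to writing a one-line descriptor to the kernel. Below is a hedged sketch of the line qemu-binfmt-conf.sh builds for 32-bit little-endian ARM; the magic/mask values are copied from that script, the interpreter path is the one qus uses, and flags such as F or P may be appended at the end:

```shell
# ELF header magic/mask for a 32-bit LSB ARM binary (EM_ARM = 0x28 at
# offset 18); values taken from qemu-binfmt-conf.sh.
arm_magic='\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x28\x00'
arm_mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff'

# Descriptor format: :name:type:offset:magic:mask:interpreter:flags
line=":qemu-arm:M::${arm_magic}:${arm_mask}:/qus/bin/qemu-arm-static:"
printf '%s\n' "$line"

# On a real host (with binfmt_misc mounted), registration is effectively:
#   printf '%s' "$line" > /proc/sys/fs/binfmt_misc/register
```

Because the script registers by family, this line is simply never emitted when the host itself is in the 'arm' family.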

Hence, we have isolated the issue, I think. Now, I would suggest that you:

  • Download the DEB package above, extract it and pick qemu-arm-static only.
  • Build or copy some binary built for armhf and execute ./qemu-arm-static yourbinary. Try ./yourbinary too. The former should work, the latter should fail.
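For reference, the two steps above could be scripted roughly as follows (the package URL is the one quoted earlier; the helper name fetch_qemu_arm_static is made up here, and the commands are wrapped in a function so they can be reviewed before anything touches the network):

```shell
# Download the Debian package and extract only qemu-arm-static from it.
fetch_qemu_arm_static() {
    wget http://ftp.debian.org/debian/pool/main/q/qemu/qemu-user-static_4.2-6_arm64.deb
    dpkg-deb -x qemu-user-static_4.2-6_arm64.deb extracted/
    cp extracted/usr/bin/qemu-arm-static .
}

# Then, with some armhf binary 'yourbinary':
#   ./yourbinary                    # expected to fail: Exec format error
#   ./qemu-arm-static yourbinary    # expected to work
```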

If that works as expected, we need to find the equivalent of running qemu-binfmt-conf.sh for your case only, in order to register qemu-arm-static with -p and have it loaded into memory. Doing so should allow you to run arm32v7 containers, and it should allow me to "fix" the fork of the script that we use here.

Unfortunately, playing with this can be risky. If we modify the script improperly and qemu-aarch64-static gets registered on an aarch64 host, weird things will happen. That's why I suggest we go step by step and introduce commands manually first. Slow but safe.

@umarcor: thanks for the fantastic reply. I'll try your steps and report.
@vielmetti: can you answer about the hardware? Is the box a ThunderX that supports 64-bit only? The problem I have right now is that it does not seem to run 32-bit binaries, but I'll run more tests.

@Silex

ThunderX is 64-bit only, as is ThunderX2.
Ampere eMag is 32- and 64-bit.

@vielmetti: thanks for confirming. I'll try umarcor's idea, which should work, and will keep you posted.

@umarcor: I confirm your suggestion works:

root@docker-emacs:~# file hello
hello: ELF 32-bit LSB executable, ARM, EABI5 version 1 (GNU/Linux), statically linked, for GNU/Linux 3.2.0, BuildID[sha1]=99367b08573e864abd79831673e2fbbd7bb1f82f, not stripped

root@docker-emacs:~# ./hello
-bash: ./hello: cannot execute binary file: Exec format error

root@docker-emacs:~# ./qemu-arm-static hello
hello world

I cross-compiled a basic hello.c using arm-linux-gnueabihf-gcc -static inside an arm64v8 container and copied the binary to the host. I extracted qemu-arm-static from apt download qemu-user-static; do you need me to test with your Debian package instead? This box runs Ubuntu 18.04.
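The cross-compilation just described can be sketched like this (assuming the gcc-arm-linux-gnueabihf toolchain is installed; the compile step is wrapped in a function since the cross-compiler is not present on every host):

```shell
# A minimal static hello world to exercise qemu-arm-static.
cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) { puts("hello world"); return 0; }
EOF

build_hello() {
    # -static so the binary carries its own libc and needs no armhf
    # libraries on the host
    arm-linux-gnueabihf-gcc -static -o hello hello.c
    file hello   # should report: ELF 32-bit LSB executable, ARM, EABI5 ..., statically linked
}
```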

@vielmetti, thanks for clarifying.

@Silex, your tests are ok. The qemu-arm-static you extracted is likely to be the exact same version (or very similar), just retrieved with a different procedure.

I wrote a temporary fix for qemu-binfmt-conf.sh which splits family 'arm' into 'arm' and 'aarch64', and 'armeb' into 'armeb' and 'armeb64'. As a result, you should get all but aarch64 registered. In order to try it, please follow these steps:

$ docker run --rm --privileged -itv $(pwd)/qemu-binfmt-conf.sh:/qus/qemu-binfmt-conf.sh --entrypoint=sh aptman/qus
# cat /qus/qemu-binfmt-conf.sh | grep armeb64

grep should find some result.

  • If successful, use it to run the "regular" command:
docker run --rm --privileged -v $(pwd)/qemu-binfmt-conf.sh:/qus/qemu-binfmt-conf.sh aptman/qus -s -- -p

That will hopefully allow you to run ./hello and to use arm32v7 containers on an aarch64-only host. It is not a clean solution, though, because it breaks backwards compatibility. Ideally, the behaviour you need would be explicitly enabled through some option/switch. However, I'd like to think about it after you have tried the current proposal.

@umarcor: it works! 🎉

The only weird thing is that the first time I run it for other architectures, it seems to print the output twice, but maybe that's just this particular session:

root@docker-emacs:~# docker run --rm -it i386/ubuntu bash -c 'uname -m'
i686i686

root@docker-emacs:~# docker run --rm -it i386/ubuntu bash -c 'uname -m'
i686

root@docker-emacs:~# docker run --rm -it arm32v7/ubuntu bash -c 'uname -m'
armv7l

Thanks a lot, I'll run some tests and report if I find weird things.

Oh, btw, I'm not sure you need to fix this in qus... 64-bit-only Arm architectures are pretty rare, no? I mean, for me an entry in the wiki would be ok.

@umarcor: it works! 🎉

Awesome! I'm glad it was such an easy fix :D

The only weird thing is that the first time I run it for other architectures, it seems to print the output twice, but maybe that's just this particular session

I think it must be some transient, unimportant issue. If it turns out to be an annoying intermittent bug, it is definitely not related to the last fix: umarcor/qemu@699f972

Thanks a lot, I'll run some tests and report if I find weird things.

You are welcome! And thank you for helping get qus tested on not-so-mainstream platforms!

Nevertheless, note that you are likely to find bugs if you run complex applications, especially ones that use signals. Furthermore, you should expect a 4-10x execution time penalty. Unfortunately, this is unrelated to qus (which is a helper/wrapper project), but dependent on QEMU's development history. See https://github.com/dbhi/qus/blob/master/docs/context.md. Fortunately, qemu-user and ARM targets seem to be gaining increased attention.

Regarding the combination of Docker and QEMU (system mode), you might want to raise your hand in kata-containers/runtime#1280.

Oh, btw, I'm not sure you need to fix this in qus... 64-bit-only Arm architectures are pretty rare, no? I mean, for me an entry in the wiki would be ok.

As said, this is not really related to qus, but to QEMU. The modifications that were made to qemu-binfmt-conf.sh will hopefully be upstreamed some day: https://patchew.org/search?q=project%3AQEMU+qemu-binfmt-conf.sh

Hence, in order to avoid keeping a forked/custom script after all other tweaks/enhancements are upstreamed, I think that this fix should be upstreamed too. From my point of view, there is currently a bug in all official qemu-user packages that prevents you from having a working default installation. I.e., you do need to tweak qemu-binfmt-conf.sh regardless of using qus.

EDIT

https://lists.gnu.org/archive/html/qemu-devel/2020-04/msg03667.html

64-bit-only Arm architectures are characteristic of the newest Arm server and supercomputer designs, including Cavium ThunderX, Marvell ThunderX2, and Qualcomm Centriq (now defunct).

Nevertheless, note that you are likely to find bugs if you run complex applications, especially ones that use signals. Furthermore, you should expect a 4-10x execution time penalty. Unfortunately, this is unrelated to qus (which is a helper/wrapper project), but dependent on QEMU's development history. See https://github.com/dbhi/qus/blob/master/docs/context.md. Fortunately, qemu-user and ARM targets seem to be gaining increased attention.

Well, yes, that's the story of my life so far. I'm trying to cross-build Docker images for Emacs (Silex/docker-emacs#38) and I run into issue after issue, because Emacs is both complex and requires weird configuration (namely, having to disable ASLR) in order to build (https://bugs.launchpad.net/qemu/+bug/1861161 is an example of a QEMU bug I usually run into from amd64 hosts).

Anyway, I think it's now safe to say that the QEMU way is a dead end for Emacs. The only way would be to build each image on the right architecture (though I never had issues building i386/amd64 on the same architecture, so I expect an arm32/arm64-capable host to build Emacs images without any issues as well).

So, basically this means I can use this server to build arm64 images, I can use my own server to build i386/amd64 images and I just need to find one more server to build arm32 \o/

@NicolasPetton, @vielmetti: sorry to ask again, but could you provide an arm32 host? I'd then need to refactor and split the monolithic design of the current building process into 3 GitLab workers, where each worker builds only for its architecture, with all the fun that implies 😅

EDIT

@vielmetti: I saw your comment about Ampere eMag being both 32- and 64-bit; would it be complicated for me to use that kind of server instead?

64-bit-only Arm architectures are characteristic of the newest Arm server and supercomputer designs, including Cavium ThunderX, Marvell ThunderX2, and Qualcomm Centriq (now defunct).

@vielmetti, since I believe this trend is not going to slow down, I think it would be good to make the upstream qemu-binfmt-conf.sh script properly identify them. Are you aware of any short, easy test that I can run from a shell script to tell your 64-bit-only machines (ThunderX, ThunderX2, etc.) apart from others (eMag)?

I just need to find one more server to build arm32

@Silex I guess you took a really hard target to fight with ;)

FYI we found an alternative setup that seems to work for us, so this issue is not blocking anymore. I'll keep it open so you can decide what to do with it.

About your question on how to detect these arm64-only architectures, I don't know but my google-fu found this:

https://community.arm.com/developer/ip-products/processors/f/cortex-a-forum/6664/how-to-do-the-arm-state-change-between-64-bit-and-32-bit

If the hardware supports AArch32 and AArch64 at all exception levels then the RMR_ELx register (where x is the highest implemented exception level, typically x=3) might be used to switch the execution state at the highest exception level by resetting the processor. But that assumes you have access to the highest exception level. (EL3 is 2 levels above where the OS kernel executes, for example).

I notice cat /proc/cpuinfo gives this:

processor	: 31
BogoMIPS	: 80.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer	: 0x50
CPU architecture: 8
CPU variant	: 0x3
CPU part	: 0x000
CPU revision	: 2

If you compare it to the ThunderX one, the CPU implementer changed from 0x43 to 0x50, the CPU variant from 0x1 to 0x3, and the CPU part also changed. Not sure if that helps.
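For what it's worth, here is a purely hypothetical sketch of such a check: a lookup table built only from the two cpuinfo dumps in this thread (the function name and the table are made up; the kernel's own report of AArch32 "compat" support would be the authoritative source):

```shell
# Hypothetical helper: classify a cpuinfo file by its implementer/part
# pair. 0x43:0x0a1 is the Cavium ThunderX dump above; anything else is
# reported as unknown (e.g. the eMag dump: 0x50:0x000).
is_aarch64_only() {
    impl=$(awk -F': ' '/CPU implementer/ {print $2; exit}' "$1")
    part=$(awk -F': ' '/CPU part/ {print $2; exit}' "$1")
    case "$impl:$part" in
        0x43:0x0a1) echo yes ;;
        *)          echo unknown ;;
    esac
}

# Example with the ThunderX values from this thread:
cat > cpuinfo.sample <<'EOF'
processor : 0
CPU implementer : 0x43
CPU architecture: 8
CPU part : 0x0a1
EOF
is_aarch64_only cpuinfo.sample   # prints: yes
```

This is not a real capability test, just the implementer/part comparison you describe, so the table would have to grow with every new 64-bit-only design.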

Yes, please, leave it open. I believe this is a bug upstream that should be fixed.

Thanks for the reference and the hint about the CPU variant. Unfortunately, I cannot focus on this issue at the moment, but I will follow those leads when I come back to it.

Hello -

I know this thread has been quiet for a long time, but I wanted to add another processor (family) affected by this issue. I recently moved the Docker stack running on my M1 Mac Mini into an Ubuntu VM and began having issues with some images listed as "aarch64" no longer running. They all expected 32-bit support to be present, but it wasn't available in the VM. The modified qemu-binfmt-conf.sh got those images working again!

Similar to what @rborkow says, in order to build Selenium Docker container images for armhf (linux/arm/v7), I had to use the modified qemu-binfmt-conf.sh script when registering the architectures. Otherwise, I was only able to build for aarch64 and amd64. I'm on a Mac M1 as well. The folks using the armhf container images are running them on Raspberry Pis, which I believe wouldn't be able to run the aarch64 images. Hope this info helps, and thanks to all who contributed to the modified script.

FTR, all qus images include the fix now; so, using the modified script should not be required anymore.