raspberrypi / firmware

This repository contains pre-compiled binaries of the current Raspberry Pi kernel and modules, userspace libraries, and bootloader/GPU firmware.


Boot firmware modifies/writes to FAT filesystem on SD card, breaking dm-integrity

jplarocque opened this issue

Describe the Bug

Hi,

The Raspberry Pi firmware seems to write to the SD card at bootup, modifying the boot FAT filesystem—the one typically mounted at /boot/firmware, containing the firmware being booted, the kernel image, device tree blobs, initrd image, and configuration files like config.txt and cmdline.txt.

To Reproduce

Clone the latest Raspberry Pi firmware (commit 5c83250 in my case):

$ git clone --depth 1 https://github.com/raspberrypi/firmware.git

Run a script I wrote (attached) to generate a minimal SD card image. For convenience, I'm also attaching a compressed output image. The image size is 256 MiB, containing a single partition with just the firmware, and not containing a Linux kernel, initrd, or any root filesystem. Then write the image to an SD or microSD card:

$ sudo ./make_image.sh
[various verbose output omitted]
# I'm in the `disk` group; you may need to wrap with `sudo sh -c '...'`; and
# adjust the destination path as appropriate:
$ cat image > /dev/disk/by-id/usb-TS-RDF8_SD_Transcend_000000079-0\:1

Let's be absolutely certain that the image was written without any corruption:

$ cmp image /dev/disk/by-id/usb-TS-RDF8_SD_Transcend_000000079-0\:1
cmp: EOF on image after byte 268435456, in line 94408

Move the card to a Raspberry Pi and try booting from it. Wait just a few seconds after the activity lights stop blinking; it won't take long, since there's no kernel to load or boot.

Unpower the Pi, and move the card back to your computer. Take an image of the card (limited to 256 MiB to match the original image; this may need sudo):

$ head -c 256MiB /dev/disk/by-id/usb-TS-RDF8_SD_Transcend_000000079-0\:1 > after_rpi3.img

Compare the images:

$ cmp image after_rpi3.img
image after_rpi3.img differ: byte 1049581, line 3
$ diff -u <(hd image) <(hd after_rpi3.img) | dwdiff -u
--- /dev/fd/63  2024-05-21 19:51:57.006203947 -0700
+++ /dev/fd/62  2024-05-21 19:51:57.006203947 -0700
@@ -27,7 +27,7 @@
 00100200  52 52 61 41 00 00 00 00  00 00 00 00 00 00 00 00  |RRaA............|
 00100210  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 001003e0  00 00 00 00 72 72 41 61  76 24 07 00 [-09-] {+0a+} b4 00 00  |....rrAav$......|
 001003f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
 00100400  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
$ sudo losetup -o 1MiB --show -rf image
/dev/loop1
$ sudo losetup -o 1MiB --show -rf after_rpi3.img 
/dev/loop2
$ diff -u <(minfo -i /dev/loop1 ::) <(minfo -i /dev/loop2 ::)
Could not get geometry of device (Inappropriate ioctl for device)Could not get geometry of device (Inappropriate ioctl for device)--- /dev/fd/63     2024-05-21 19:54:40.478537134 -0700
+++ /dev/fd/62  2024-05-21 19:54:40.478537134 -0700
@@ -1,6 +1,6 @@
 device information:
 ===================
-filename="/dev/loop1"
+filename="/dev/loop2"
 sectors per track: 32
 heads: 16
 cylinders: 1020
@@ -8,7 +8,7 @@
 media byte: f8
 
 mformat command line:
-  mformat -t 1020 -h 16 -s 32 -r 0 -c 1 -m 248 -i "/dev/loop1" ::
+  mformat -t 1020 -h 16 -s 32 -r 0 -c 1 -m 248 -i "/dev/loop2" ::
 
 bootsector information
 ======================
@@ -41,4 +41,4 @@
 Infosector:
 signature=0x41615252
 free clusters=468086
-last allocated cluster=46089
+last allocated cluster=46090
$ sudo losetup -d /dev/loop1 /dev/loop2
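
The cmp offset and the hexdump above point at the same field. A rough sketch of the arithmetic follows; locate_byte is a name made up for illustration, the 512-byte sector size is an assumption, and the 1 MiB partition start is taken from the losetup -o 1MiB calls above. cmp counts bytes from 1, so byte 1049581 is image offset 0x1003EC.

```shell
# locate_byte CMP_BYTE PART_START [SECTOR_SIZE]
# Convert a 1-based byte number reported by cmp into a sector index and an
# offset within that sector, both relative to the start of the partition.
locate_byte() {
    cmp_byte=$1
    part_start=$2
    sector_size=${3:-512}   # assumption: 512-byte sectors
    rel=$(( cmp_byte - 1 - part_start ))
    printf 'sector=%d offset=0x%X\n' $(( rel / sector_size )) $(( rel % sector_size ))
}

locate_byte 1049581 1048576   # prints: sector=1 offset=0x1EC
```

Offset 0x1EC of a FAT32 FSInfo sector holds the next-free-cluster hint, which is the value minfo reports above as "last allocated cluster".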

Expected Behavior

After the above steps, I would expect cmp image after_rpi3.img to produce no output, and exit with code 0.

Generally, I never expected the early boot process of a Raspberry Pi to write to or modify the contents of its SD card. I thought this could/should only happen at some point after the kernel has loaded.

Actual Behavior

The Raspberry Pi firmware writes to the SD card, modifying the contents of the FAT filesystem from which it has booted. This occurs before the kernel loads.

The purpose of make_image.sh is to produce an extremely simple cut-down test image, where the buggy behavior matches the behavior I found in my real-world installation. It also demonstrates that the problem is not in the Linux kernel, initrd, or anywhere in userland, since those files are never copied into the image. The script does generate small dummy (all-zeroes) kernel.img, etc. files in lieu of copying the Linux kernel images. This is because without those files, the firmware doesn't seem to advance far enough in its boot process to reproduce this issue. Giving it kernel files with no usable code is sufficient, though.
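
The attached make_image.sh is not reproduced here. As a hypothetical sketch of just the dummy-file step, assuming the standard kernel image names and an arbitrary 4 KiB size (the script's actual file list and sizes are not stated in this report; make_dummy_kernels is an invented name):

```shell
# make_dummy_kernels DIR [SIZE]
# Create all-zero placeholder kernel images in DIR so that the firmware has
# something to load. File names and the default size are assumptions.
make_dummy_kernels() {
    dir=$1
    size=${2:-4096}
    for name in kernel.img kernel7.img kernel7l.img kernel8.img; do
        head -c "$size" /dev/zero > "$dir/$name"
    done
}

# Example (hypothetical mount point of the image's FAT partition):
#   make_dummy_kernels /mnt/firmware
```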

System

  • Which model of Raspberry Pi? Reproduced on a Raspberry Pi 1 Model B rev. 2, and on a Raspberry Pi 3 Model B+ rev. 1.3.

  • Which OS and version? Originally found with a Debian Bookworm system (so no /etc/rpi-issue), but seems to occur independent of the OS/distribution.

  • Which firmware? No vcgencmd on Debian, but I found it on a system with Debian's raspi-firmware package version 1.20220830+ds-1, as well as with commit 5c83250 in this repo.

  • Which kernel version? Originally found with a system reporting Linux mosaik 6.1.0-21-rpi #1 Debian 6.1.90-1 (2024-05-03) armv6l GNU/Linux, but seems to occur independent of the kernel version.

Additional Context

Surprising and Undocumented Behavior

I think it's bad for the firmware to write to the SD card in principle. There's no functionality that I'm aware of in this early, pre-kernel boot process that suggests that writing to the card is required, or could ever happen.

I've scanned the documentation in these pages, which seem to cover the boot process the most, and couldn't find this behavior documented:

Breaks dm-integrity

This behavior causes problems for me because I'm trying to use dm-integrity for protection against data corruption in /boot/firmware. I've found microSD cards and the ecosystem surrounding them to be unreliable:

  • Even the good Samsung ones, in various readers. I've had to RMA a brand-new Samsung EVO Select for silent data corruption. Though I haven't tried industrial-grade cards yet.

  • Even those little USB 2.0 microSD readers that CanaKit used to (or still do) ship will occasionally silently corrupt data. I should have reported it, but I left the employer who was buying those kits, and don't have any CanaKit-branded readers to test with again in my personal time.

  • And yes, I experience this with good power supplies, and without unclean shutdowns or unexpected power loss.

While microSD just isn't dependable in my experience, ultimately any storage device of any medium/type/format will some day fail.

For Raspberry Pis I run where availability is important to me, I try to use RAID1-equivalent protection using btrfs or ZFS for the root filesystem, mirroring data with another microSD card in an external USB card reader. Since the Raspberry Pi and its firmware require FAT for the firmware filesystem, I'm trying to use dm-integrity to catch any data corruption on that partition. On top of dm-integrity, I layer on mdraid level 1 with metadata format 1.0 (metadata stored at the end of the device), mirroring the contents onto another microSD card in a USB card reader which is formatted the same way. If I ever have trouble booting the main microSD card, I can just swap them, and then make sure that a RAID scrub fixes the problem on both disks (and also consider whether to replace the first card). And a periodic RAID scrub cronjob should make sure that any corrupted writes or bitrot will eventually be caught and corrected.

The gist of dm-integrity is that it maintains a checksum of every sector, so that it can tell when data has been silently corrupted and present the condition to higher layers as a read error rather than incorrect data. Usually it interleaves checksums with data sectors in the underlying storage medium, but in this case, to allow the firmware which isn't aware of dm-integrity to be able to read the filesystem, I put the checksums and other dm-integrity-specific data on a separate partition (formatting and opening with integritysetup --data-device ...).
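
To illustrate the idea only (this is not how dm-integrity computes or stores its tags): a toy sketch that keeps one cksum value per 512-byte sector in a separate file and later lists the sectors whose contents no longer match. sector_sums and stale_sectors are names invented for this example.

```shell
# sector_sums FILE
# Print "index checksum" for every 512-byte sector of FILE.
sector_sums() {
    size=$(( $(wc -c < "$1") ))
    n=$(( (size + 511) / 512 ))
    i=0
    while [ "$i" -lt "$n" ]; do
        sum=$(dd if="$1" bs=512 skip="$i" count=1 2>/dev/null | cksum | cut -d' ' -f1)
        echo "$i $sum"
        i=$(( i + 1 ))
    done
}

# stale_sectors SUMS_FILE IMAGE
# Print the index of each sector in IMAGE whose checksum differs from the
# one recorded earlier in SUMS_FILE.
stale_sectors() {
    sector_sums "$2" | diff - "$1" | sed -n 's/^< \([0-9]*\) .*/\1/p'
}

# Example:
#   sector_sums image > image.sums       # record checksums out of band
#   stale_sectors image.sums after.img   # list sectors changed since then
```

A writer that changes a data sector without updating the recorded checksum, as the firmware does here, shows up as a stale sector in this toy model; dm-integrity surfaces the equivalent condition as a read error.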

This is why the Raspberry Pi firmware writing to the device causes me grief: it successfully reads the FAT filesystem which is layered on top of mdraid 1.0 and dm-integrity, but the act of changing the filesystem without the awareness to update dm-integrity checksums invalidates those checksums. dm-integrity then reports read errors, as intended, when the system boots up.

Thank you for reading my report. Would you please consider updating the Raspberry Pi firmware so that it no longer writes to the SD card at boot?

Before considering the possibility that the firmware may actually write to the SD card, note that there is a fault in your test procedure. You should repeat the test without turning on the Raspberry Pi, i.e. remove the card from the PC and reinsert it, then look for differences. You can also repeat the test to see whether the data changes again or whether it is a one-off.

That's a fair point. Here is my result immediately after writing the image:

$ cmp image /dev/disk/by-id/usb-TS-RDF8_SD_Transcend_000000079-0\:1
cmp: EOF on image after byte 268435456, in line 94408

(So the initial 256 MiB of the SD card matches the entirety of the image.)

Here's the result after removing and re-inserting the card:

$ cmp image /dev/disk/by-id/usb-TS-RDF8_SD_Transcend_000000079-0\:1
cmp: EOF on image after byte 268435456, in line 94408

And here's the result after trying to boot my Pi 1 Model B with it:

$ cmp image /dev/disk/by-id/usb-TS-RDF8_SD_Transcend_000000079-0\:1
image /dev/disk/by-id/usb-TS-RDF8_SD_Transcend_000000079-0:1 differ: byte 1049581, line 3
$ cmp after_rpi1.img /dev/disk/by-id/usb-TS-RDF8_SD_Transcend_000000079-0\:1
cmp: EOF on after_rpi1.img after byte 268435456, in line 94409

(Mismatch to the written image, exact match to a prior image I took after reproducing this problem.)
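
Because the card is larger than the image, a plain cmp always ends with the EOF message and a non-zero exit status, even when every image byte matches. A small sketch that compares only the leading image-sized part of the device, so that the exit status alone says whether they match (same_prefix is an invented helper name; head -c is used as elsewhere in this report):

```shell
# same_prefix IMAGE DEVICE
# Succeed (exit 0) if the first size-of-IMAGE bytes of DEVICE equal IMAGE.
same_prefix() {
    size=$(( $(wc -c < "$1") ))
    head -c "$size" "$2" | cmp -s "$1" -
}

# Example (device path is a placeholder):
#   same_prefix image /dev/disk/by-id/... && echo unchanged || echo modified
```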

Also, consider that I couldn't reproduce the problem with no kernel*.img files; I had to create bogus ones. If it was wonkiness on my computer causing this, then it would have caused it then too, but it was worthwhile to double check.

As far as repeating the test again in case it was a one-off, I've reliably repeated this test over a dozen times while preparing my report, and even checked and reproduced it on another model of Pi. Can you take a look to see if you can reproduce it on one of your Pis? It should only take a spare SD card that you don't mind overwriting, and a few minutes of work.

Thanks,
-Jean-Paul

To preempt some other concerns that may be raised:

I've reproduced this issue with three models of SD card:

  • SanDisk, "SDHC Card", Class 4 mark, "8 GB"
  • PNY "SD", "1 GB"
  • Onefavor "TF card" "2 GB" (weird little guys I picked up from AliExpress for netbooting), via a Samsung branded "SD Adapter for microSD"

I've reproduced the issue with three models of card reader, getting the exact same results before and after (including the exact same changed byte after booting with the Pi):

  • Transcend TS-RDC8K USB 3.1 multi-interface card reader, including double-checking after removing and re-inserting the card and running cmp between the image and the card.
  • PNY microSD card reader, model unknown, idVendor=0bda, idProduct=0109, USB product string "USB2.0-CRW", double-checked by removing and re-inserting the card reader with a microSD card still in it, then running cmp.
  • SanDisk SDDR-339 micro SD UHS-II USB 3.0 Reader, double-checked by removing and re-inserting the card reader with a microSD card still in it, then running cmp.

The Raspberry Pi 1 Model B rev. 2 that I tested with is powered by a Riden RD6006P bench supply set for 5.25 V and 5 A, through a combination of 1 m 14 AWG copper wire leads and 0.5 m 22 AWG copper wire leads. These connect to pins 4 and 6 of P1 of the Raspberry Pi. (Excessive detail edited out, because it was snarky and unproductive. I apologize for that.)

The Raspberry Pi 3 Model B+ rev. 1.3 that I tested with is powered by a Samsung Travel Adapter model EP-TA10JWE rated for 5.3 V 2.0 A output. A 1.2 m USB Type A to micro B cable is used to connect the Samsung power supply to micro USB connector J1 on the Raspberry Pi, and this cable was marketed as 20 AWG when I bought it. On the cable, the jacket is marked: "20AWG+2C".

Please let me know if there's anything else I can check, or any other information that might be helpful.

To our general surprise, it turns out that you are correct. Although the firmware makes no attempt to write anything to the card, the filesystem layer unconditionally writes back that sector, and because it has a slightly different idea of what the next cluster hint should be, you see a one-off change.
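
The hint in question can be read straight out of an image. A minimal sketch, assuming the FSInfo sector is sector 1 of the partition (as in the hexdump earlier in this thread; in general the boot sector says where it is), 512-byte sectors, and a partition starting at 1 MiB. read_le32 and fsinfo_fields are hypothetical helper names.

```shell
# read_le32 FILE OFFSET
# Print the unsigned 32-bit little-endian integer stored at byte OFFSET.
read_le32() {
    # After this, $1..$4 are the four bytes in file order, as decimal numbers.
    set -- $(od -An -tu1 -j "$2" -N4 "$1")
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# fsinfo_fields IMAGE [PART_START]
# Print the two FAT32 FSInfo counters: free cluster count (offset 488) and
# next-free-cluster hint (offset 492), assuming FSInfo is sector 1.
fsinfo_fields() {
    img=$1
    base=$(( ${2:-1048576} + 512 ))
    echo "free clusters=$(read_le32 "$img" $(( base + 488 )))"
    echo "next free hint=$(read_le32 "$img" $(( base + 492 )))"
}

# Example:
#   fsinfo_fields image
#   fsinfo_fields after_rpi3.img
```

Going by the minfo output in the report, the hint would be expected to read 46089 for the original image and 46090 for the image taken after booting.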

I've attached a trial version of the firmware (just start.elf and fixup.dat) that should never write anything to the card. Let me know how you get on with it.

sdro_firmware_240523.zip

Wonderful, I can confirm that it fixes the issue with my test image on all models I've tested with:

  • Raspberry Pi 1 Model B rev. 2
  • Raspberry Pi 2 Model B V1.1 (dug up for more testing)
  • Raspberry Pi 3 Model B+ rev. 1.3

For the original system installation where I found the problem, I had to disable gpu_mem=16 in config.txt to get the fix to work, since (if I understand correctly) it was loading start_cd.elf and fixup_cd.dat instead of the versions of start.elf and fixup.dat that you provided.

Thank you for the fix!

The firmware patch that prevents all writes has been merged, and all future firmware releases will include it.