anatol / booster

Fast and secure initramfs generator

RAID0 btrfs root missing device

Cytraen opened this issue

I get the following error (or a variation of it with the other device in the array) after the bootloader:

[1.445824] BTRFS error (device nvme0n1p2): devid 2 uuid 5ee17996-0a1f-40a6-b7b3-b12b5a35a2e4 is missing
[1.446384] BTRFS error (device nvme0n1p2): failed to read the system array: -2
[1.446506] BTRFS error (device nvme0n1p2): open_ctree failed

My systemd-boot entry looks like this:

title   Arch linux w/ Booster
linux   /vmlinuz-linux
initrd  /amd-ucode.img
initrd  /booster-linux.img
options root=UUID=beb9b779-e5a1-43ea-be97-df3148393954 resume=UUID=5ef5e6dc-ac6e-47fb-93ab-591e7e21a6b5 rw rootflags=relatime,ssd,space_cache=v2,subvolid=5,subvol=/

I have tried setting universal in the config, and adding btrfs to each of the modules and modules_force_load lists.

Output of blkid
/dev/nvme0n1p1: UUID="BBC3-10DA" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="7d8f74f7-9a34-4b6e-ad8a-caf11ca20769"
/dev/nvme0n1p2: UUID="beb9b779-e5a1-43ea-be97-df3148393954" UUID_SUB="09f5696a-f51e-4b05-96b6-53db26ff1536" BLOCK_SIZE="4096" TYPE="btrfs" PARTUUID="698a8e99-9f38-477f-9e6e-623320cfa464"
/dev/nvme1n1p2: UUID="beb9b779-e5a1-43ea-be97-df3148393954" UUID_SUB="5ee17996-0a1f-40a6-b7b3-b12b5a35a2e4" BLOCK_SIZE="4096" TYPE="btrfs" PARTUUID="2b4f5e6f-b022-4972-9a38-f5e39e778106"
/dev/nvme1n1p1: UUID="5ef5e6dc-ac6e-47fb-93ab-591e7e21a6b5" TYPE="swap" PARTUUID="d01ab3bd-d24e-40ad-a010-7565d6c36335"
fstab
# /dev/nvme0n1p2
UUID=beb9b779-e5a1-43ea-be97-df3148393954	/         	btrfs     	rw,relatime,ssd,space_cache=v2,subvolid=5,subvol=/	0 1

# /dev/nvme0n1p1
UUID=BBC3-10DA      	/boot     	vfat      	rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro	0 2

# /dev/nvme1n1p1
UUID=5ef5e6dc-ac6e-47fb-93ab-591e7e21a6b5	none      	swap      	defaults  	0 0
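In the blkid output above, both nvme0n1p2 and nvme1n1p2 report the same filesystem UUID with different UUID_SUB values, which is how the members of one multi-device btrfs filesystem show up. A minimal sketch for spotting this from blkid text follows; the helper names are hypothetical and the sample lines carry only a subset of the fields shown above.

```python
import shlex
from collections import defaultdict


def parse_blkid_line(line):
    """Split one blkid line into (device, {KEY: value})."""
    device, _, rest = line.partition(": ")
    # shlex strips the double quotes around each value.
    fields = dict(token.split("=", 1) for token in shlex.split(rest))
    return device, fields


def btrfs_members(blkid_output):
    """Group btrfs block devices by filesystem UUID."""
    members = defaultdict(list)
    for line in blkid_output.splitlines():
        if not line.strip():
            continue
        device, fields = parse_blkid_line(line)
        if fields.get("TYPE") == "btrfs":
            members[fields["UUID"]].append(device)
    return dict(members)


sample = """\
/dev/nvme0n1p1: UUID="BBC3-10DA" BLOCK_SIZE="512" TYPE="vfat"
/dev/nvme0n1p2: UUID="beb9b779-e5a1-43ea-be97-df3148393954" UUID_SUB="09f5696a-f51e-4b05-96b6-53db26ff1536" TYPE="btrfs"
/dev/nvme1n1p2: UUID="beb9b779-e5a1-43ea-be97-df3148393954" UUID_SUB="5ee17996-0a1f-40a6-b7b3-b12b5a35a2e4" TYPE="btrfs"
"""

groups = btrfs_members(sample)
for uuid, devices in groups.items():
    print(uuid, devices)
```

A UUID that maps to more than one device means root=UUID=... alone does not identify a single block device to mount.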

Thank you for the report.

What is the mkfs.btrfs command that initializes such a configuration?

mkfs.btrfs -f -d raid0 /dev/nvme0n1p2 /dev/nvme1n1p2

Possibly related to #193. I believe there is supposed to be a btrfs hook that waits for devices to be ready before attempting to mount.

This issue is not related to #193. It is actually something similar to #97.

Right now, booster tries to mount the btrfs array as soon as one device is found. Instead, booster's btrfs integration code should learn to handle the array assembly incrementally.
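One way to picture incremental assembly is a small tracker that records which member devices of each filesystem have appeared and reports when the set is complete. This is only an illustrative sketch, not booster's code (booster is written in Go): the class and method names are hypothetical, and total_devices is assumed to come from the num_devices field of the btrfs superblock.

```python
from dataclasses import dataclass, field


@dataclass
class BtrfsArray:
    total_devices: int                      # expected member count (assumed from the superblock)
    seen_devids: set = field(default_factory=set)


class ArrayTracker:
    """Hypothetical helper: tracks btrfs members per filesystem UUID."""

    def __init__(self):
        self._arrays = {}

    def register(self, fsid, devid, total_devices):
        """Record one member device; return True once all members were seen."""
        array = self._arrays.setdefault(fsid, BtrfsArray(total_devices))
        array.seen_devids.add(devid)        # a set ignores duplicate events
        return len(array.seen_devids) >= array.total_devices


tracker = ArrayTracker()
fsid = "beb9b779-e5a1-43ea-be97-df3148393954"
print(tracker.register(fsid, 1, 2))  # False: only devid 1 has appeared
print(tracker.register(fsid, 2, 2))  # True: both members are present
```

With such a tracker, the mount would be attempted only when register() returns True, rather than on the first device event.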

I added an integration test that exposes this problem.

I checked the btrfs toolset and found the btrfs device ready $device command. Its documentation says "Wait until all devices of a multiple-device filesystem are scanned and registered within the kernel module", which sounds like what we need here.

I checked the tool's source code, and all this command does is issue the BTRFS_IOC_DEVICES_READY ioctl. So I added a patch that calls this ioctl.

Unfortunately, it does not work. Here is the kernel log from the test:

[    1.428391] booster: loading module btrfs
[    1.515173] Btrfs loaded, crc32c=crc32c-intel, zoned=yes, fsverity=yes
[    1.516205] booster: udev event {Action:add KObj:/devices/virtual/misc/btrfs-control Env:map[ACTION:add DEVNAME:btrfs-control DEVPATH:/devices/virtual/misc/btrfs-control MAJOR:10 MINOR:234 SEQNUM:1137 SUBSYSTEM:misc]}
[    1.518283] booster: udev event {Action:add KObj:/module/btrfs Env:map[ACTION:add DEVPATH:/module/btrfs SEQNUM:1138 SUBSYSTEM:module]}
[    1.519611] booster: !!!!!!! BTRFS_IOC_DEVICES_READY for /dev/sda1
[    1.520681] BTRFS: device fsid 5eaa0c1c-e1dc-4be7-9b03-9f1ed5a87289 devid 1 transid 8 /dev/sda1 scanned by init (178)
[    1.522213] booster: mounting /dev/sda1->/booster.root, fs=btrfs, flags=0x0, options=
[    1.523610] booster: udev event {Action:add KObj:/devices/virtual/bdi/btrfs-1 Env:map[ACTION:add DEVPATH:/devices/virtual/bdi/btrfs-1 SEQNUM:1139 SUBSYSTEM:bdi]}
[    1.523790] BTRFS info (device sda1): using crc32c (crc32c-intel) checksum algorithm
[    1.526300] BTRFS info (device sda1): using free space tree
[    1.527574] BTRFS error (device sda1): devid 2 uuid af7b3f58-e32f-49ac-aa8d-5c82b046a466 is missing
[    1.528850] BTRFS error (device sda1): failed to read the system array: -2
[    1.530117] BTRFS error (device sda1): open_ctree failed
[    1.531034] booster: mount(/dev/sda1): invalid argument
[    1.531807] booster: udev event {Action:remove KObj:/devices/virtual/bdi/btrfs-1 Env:map[ACTION:remove DEVPATH:/devices/virtual/bdi/btrfs-1 SEQNUM:1140 SUBSYSTEM:bdi]}

The line [ 1.520681] BTRFS: device fsid 5eaa0c1c-e1dc-4be7-9b03-9f1ed5a87289 devid 1 transid 8 /dev/sda1 scanned by init (178) is caused by BTRFS_IOC_DEVICES_READY, and my expectation was that by the end of the ioctl the array would be ready to use. Unfortunately, mounting still fails.

Does BTRFS_IOC_DEVICES_READY really wait until the whole array is ready? cc some folks who know the kernel btrfs logic better @adam900710 @fdmanana
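For reference, here is a hedged sketch of issuing this ioctl from user space. The constants follow linux/btrfs.h as I understand it (magic 0x94, command number 39, a 4096-byte struct btrfs_ioctl_vol_args), and the request encoding assumes the generic _IOR layout used on x86 and ARM. The function names are hypothetical. My reading of the kernel handler is that the call does not block: it registers the given device and returns 0 if every member of the filesystem is already registered, non-zero otherwise. Treat that as an assumption to verify against your kernel version.

```python
import fcntl
import os
import struct

BTRFS_IOCTL_MAGIC = 0x94
BTRFS_PATH_NAME_MAX = 4087
# struct btrfs_ioctl_vol_args { __s64 fd; char name[BTRFS_PATH_NAME_MAX + 1]; }
VOL_ARGS_FORMAT = f"=q{BTRFS_PATH_NAME_MAX + 1}s"
VOL_ARGS_SIZE = struct.calcsize(VOL_ARGS_FORMAT)

_IOC_READ = 2  # direction bits of the generic ioctl encoding (x86, ARM)


def _ior(magic, nr, size):
    """Equivalent of the C macro _IOR(magic, nr, type) for a type of `size` bytes."""
    return (_IOC_READ << 30) | (size << 16) | (magic << 8) | nr


BTRFS_IOC_DEVICES_READY = _ior(BTRFS_IOCTL_MAGIC, 39, VOL_ARGS_SIZE)


def build_vol_args(device_path):
    """Pack the argument struct; the name field is NUL-padded by struct.pack."""
    name = os.fsencode(device_path)
    if len(name) > BTRFS_PATH_NAME_MAX:
        raise ValueError("device path too long")
    return bytearray(struct.pack(VOL_ARGS_FORMAT, 0, name))


def devices_ready(device_path):
    """Return True if the kernel reports all members registered (needs root)."""
    fd = os.open("/dev/btrfs-control", os.O_RDWR)
    try:
        # With a mutable buffer, fcntl.ioctl returns the ioctl's return code.
        return fcntl.ioctl(fd, BTRFS_IOC_DEVICES_READY, build_vol_args(device_path)) == 0
    finally:
        os.close(fd)
```

If the non-blocking reading is right, a caller would have to invoke devices_ready() again after each new block device appears instead of expecting a single call to wait for the array.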

I just pushed a commit that aims to fix this bug. Could you please pull the wip branch, build it, and see if booster works for you?

Sorry for hijacking this topic. I didn't see it yesterday and created #195 in the discussion section. I tried the new commit, but it didn't change anything. Maybe I'm doing something wrong.

If I have two encrypted disks, do I need to specify the root information only once?

title Arch Linux
linux /vmlinuz-linux
initrd /amd-ucode.img
initrd /booster-linux.img
options rd.luks.name=0d363017-ac94-458f-9fc0-803c6b114465=additionaldisk rd.luks.name=6e4c6f50-ab67-429e-b5d8-5ef53929f429=samsung root=UUID=726660ee-aab3-4025-aed5-3b58961f0616 rw rootflags=compress-force=zstd,discard=async,subvol=arch

Hi @roland-rollo, your issue sounds related to this one.

Could you please enable debug logging with the booster.log=debug,console kernel boot option and then share the information printed during boot?

I booted with

options rd.luks.name=0d363017-ac94-458f-9fc0-803c6b114465=additionaldisk rd.luks.name=6e4c6f50-ab67-429e-b5d8-5ef53929f429=samsung root=UUID=726660ee-aab3-4025-aed5-3b58961f0616 rw rootflags=compress-force=zstd,discard=async,subvol=arch booster.log=debug,console

but I don't see the output with journalctl -b. Is there a procedure for capturing all this text properly (in case I'm missing something obvious), or is taking a photo the only option?

[Image: photo of the boot screen]

After the last line shown, I can't do anything except reboot.

@roland-rollo

I expect that the kernel logs also have a line like BTRFS: device fsid 5eaa0c1c-e1dc-4be7-9b03-9f1ed5a87289 devid 1 transid 8 /dev/sda1 scanned by init (178) right before mount(). Did you rebuild the image after the booster upgrade using /usr/lib/booster/regenerate_images?

@anatol Yes, I used /usr/lib/booster/regenerate_images. I don't see anything in journalctl; it seems the output isn't recorded there. Today I tried building booster and recreating everything again, with the same result.

@roland-rollo I pushed another proposed fix to the wip branch. Please try it and let me know if it helps with your btrfs raid0 configuration.

@anatol Gloria, may Linus Torvalds bless you! It shows an error, but it booted without a problem.

So you don't have to scroll, here is my configuration:

sudo blkid
/dev/mapper/samsung: LABEL="blackrainbow-root" UUID="726660ee-aab3-4025-aed5-3b58961f0616" UUID_SUB="623fb88c-925c-4bfb-ba9c-ebbd852a8e3d" BLOCK_SIZE="4096" TYPE="btrfs"
/dev/nvme0n1: UUID="0d363017-ac94-458f-9fc0-803c6b114465" TYPE="crypto_LUKS"
/dev/mapper/stary: LABEL="blackrainbow-root" UUID="726660ee-aab3-4025-aed5-3b58961f0616" UUID_SUB="5636ac3e-b62d-4088-9c0c-ec53a95e5936" BLOCK_SIZE="4096" TYPE="btrfs"
/dev/nvme1n1p2: UUID="6e4c6f50-ab67-429e-b5d8-5ef53929f429" TYPE="crypto_LUKS" PARTUUID="98982efc-55ec-ea41-9af3-58911c471435"
/dev/nvme1n1p1: UUID="52A4-7400" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="33ef44af-b130-5d44-a091-0b1cf8d019a2"
cat /etc/booster.yaml
  dhcp: on
universal: false
compression: zstd
mount_timeout: 0s
strip: true
extra_files: busybox,nvim
vconsole: true
enable_lvm: false
enable_mdraid: true
enable_zfs: false
cat /etc/fstab
# Static information about the filesystems.
# See fstab(5) for details.

# <file system> <dir> <type> <options> <dump> <pass>
# /dev/mapper/arch LABEL=blackrainow
UUID=726660ee-aab3-4025-aed5-3b58961f0616	/         	btrfs     	rw,compress-force=zstd,discard=async,subvol=arch	0 0

# /dev/nvme0n1p1
UUID=52A4-7400      	/boot     	vfat      	rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro	0 2

# /dev/mapper/arch LABEL=blackrainow
UUID=726660ee-aab3-4025-aed5-3b58961f0616	/home     	btrfs     	rw,compress-force=zstd,discard=async,subvol=home	0 0

# /dev/mapper/arch LABEL=blackrainow
UUID=726660ee-aab3-4025-aed5-3b58961f0616	/.snapshots	btrfs     	rw,compress-force=zstd,discard=async,subvol=.snapshots	0 0
cat /boot/loader/entries/arch.conf
title Arch Linux
linux /vmlinuz-linux
initrd /amd-ucode.img
initrd /booster-linux.img
options rd.luks.name=0d363017-ac94-458f-9fc0-803c6b114465=stary rd.luks.name=6e4c6f50-ab67-429e-b5d8-5ef53929f429=samsung root=UUID=726660ee-aab3-4025-aed5-3b58961f0616 rw rootflags=compress-force=zstd,discard=async,subvol=arch booster.log=debug,console
sudo btrfs fi usage /
Overall:
    Device size:		   2.28TiB
    Device allocated:		 878.07GiB
    Device unallocated:		   1.43TiB
    Device missing:		     0.00B
    Device slack:		     0.00B
    Used:			 703.68GiB
    Free (estimated):		   1.59TiB	(min: 903.00GiB)
    Free (statfs, df):		   1.59TiB
    Data ratio:			      1.00
    Metadata ratio:		      2.00
    Global reserve:		 512.00MiB	(used: 0.00B)
    Multiple profiles:		        no

Data,single: Size:872.01GiB, Used:699.79GiB (80.25%)
   /dev/mapper/samsung	 872.01GiB

Metadata,RAID1: Size:3.00GiB, Used:1.94GiB (64.75%)
   /dev/mapper/samsung	   3.00GiB
   /dev/mapper/stary	   3.00GiB

System,RAID1: Size:32.00MiB, Used:128.00KiB (0.39%)
   /dev/mapper/samsung	  32.00MiB
   /dev/mapper/stary	  32.00MiB

Unallocated:
   /dev/mapper/samsung	 987.67GiB
   /dev/mapper/stary	 473.89GiB

Here is the log with the error. I tried full raid0 and raid1 without problems. @ashakoor Could you try now?

It shows an error, but it booted without a problem.

The new solution I provided keeps trying to mount the root partition until it succeeds. In the case of RAID0, once the first device is registered, booster tries to mount it and fails because the array is not fully assembled, which explains the error. With the next block device, the mount succeeds.
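The retry approach described above can be sketched roughly as follows. This is not booster's implementation (booster is written in Go); the function name, the FakeKernel class, and its "mount succeeds only once every member was registered" rule are hypothetical stand-ins used to simulate the RAID0 case from this issue.

```python
import errno


def mount_root_with_retry(device_events, try_mount):
    """Attempt a mount for each device as it appears; return the device that worked."""
    for device in device_events:
        try:
            try_mount(device)
        except OSError:
            continue          # array not complete yet, wait for the next device
        return device
    return None               # ran out of devices without a successful mount


class FakeKernel:
    """Simulation only: a mount attempt registers the device and fails
    until all members of the array have been registered."""

    def __init__(self, members):
        self.members = set(members)
        self.registered = set()

    def mount(self, device):
        self.registered.add(device)
        if self.registered != self.members:
            raise OSError(errno.EINVAL, "invalid argument")


kernel = FakeKernel(["/dev/nvme0n1p2", "/dev/nvme1n1p2"])
mounted_via = mount_root_with_retry(["/dev/nvme0n1p2", "/dev/nvme1n1p2"], kernel.mount)
print(mounted_via)  # /dev/nvme1n1p2
```

The first attempt fails and is logged as an error, and the second succeeds, which matches the "shows an error, but booted" behaviour reported above.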