btrfs allocation issue
tormath1 opened this issue
Description
I recently noticed (I'm not sure how long it has been happening) that BTRFS allocation varies from one build to the next, at least on current Alpha and Beta:
Example on Beta-3941.1.0 (good behavior):
$ sudo btrfs fi usage /usr
Overall:
Device size: 1015.99MiB
Device allocated: 572.00MiB
Device unallocated: 443.99MiB
Device missing: 0.00B
Device slack: 0.00B
Used: 462.76MiB
Free (estimated): 546.67MiB (min: 546.67MiB)
Free (statfs, df): 442.94MiB
Data ratio: 1.00
Metadata ratio: 1.00
Global reserve: 2.57MiB (used: 0.00B)
Multiple profiles: no
Data+Metadata,single: Size:568.00MiB, Used:462.75MiB (81.47%)
/dev/dm-0 568.00MiB
System,single: Size:4.00MiB, Used:4.00KiB (0.10%)
/dev/dm-0 4.00MiB
Unallocated:
/dev/dm-0 443.99MiB
While on a main build:
$ sudo btrfs fi usage /usr
Overall:
Device size: 1015.99MiB
Device allocated: 684.00MiB
Device unallocated: 331.99MiB
Device missing: 0.00B
Device slack: 0.00B
Used: 462.88MiB
Free (estimated): 546.61MiB (min: 546.61MiB)
Free (statfs, df): 330.94MiB
Data ratio: 1.00
Metadata ratio: 1.00
Global reserve: 2.51MiB (used: 0.00B)
Multiple profiles: no
Data+Metadata,single: Size:680.00MiB, Used:462.88MiB (68.07%)
/dev/dm-0 680.00MiB
System,single: Size:4.00MiB, Used:4.00KiB (0.10%)
/dev/dm-0 4.00MiB
Unallocated:
/dev/dm-0 331.99MiB
Allocated space is different.
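To put numbers on the difference, here is a minimal Python sketch that parses a few `Overall:` lines from the two outputs above and compares them. It assumes the `Name: value` layout shown above with sizes in MiB. The helper name `parse_overall` is made up for illustration and is not part of any btrfs tooling.

```python
import re

# Lines copied from the Beta-3941.1.0 output above.
GOOD = """\
Device allocated: 572.00MiB
Device unallocated: 443.99MiB
Used: 462.76MiB
Free (statfs, df): 442.94MiB
"""

# Lines copied from the main build output above.
BAD = """\
Device allocated: 684.00MiB
Device unallocated: 331.99MiB
Used: 462.88MiB
Free (statfs, df): 330.94MiB
"""


def parse_overall(text):
    """Return {field name: size in MiB} for lines like 'Used: 462.76MiB'."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?):\s+([\d.]+)MiB", line)
        if match:
            fields[match.group(1)] = float(match.group(2))
    return fields


good = parse_overall(GOOD)
bad = parse_overall(BAD)

# Per-field change between the two builds.
for name in good:
    delta = bad[name] - good[name]
    print(f"{name}: {good[name]:.2f} -> {bad[name]:.2f} ({delta:+.2f} MiB)")

# Space that is allocated to block groups but not actually used.
slack_good = good["Device allocated"] - good["Used"]
slack_bad = bad["Device allocated"] - bad["Used"]
print(f"allocated but unused: {slack_good:.2f} MiB vs {slack_bad:.2f} MiB")
```

With these numbers, the extra 112 MiB of allocation matches the 112 MiB drop in `Free (statfs, df)`, while `Used` differs by only 0.12 MiB.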
Impact
The impact is that the filesystem appears fuller than it really is:
14:15:07 File Size Used Avail Use% Type
14:15:07 -/usr 1016M 465M 443M 52% btrfs
14:15:07 +/usr 1016M 465M 331M 59% btrfs
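The `Use%` values in the diff above follow from the `Used` and `Avail` columns. A minimal sketch, assuming the GNU df behaviour of computing the percentage as used / (used + avail) and rounding up. The function name `df_use_percent` is hypothetical.

```python
import math


def df_use_percent(used, avail):
    """Approximate df's Use% column: used / (used + avail), rounded up."""
    return math.ceil(100 * used / (used + avail))


# Values in MiB taken from the two /usr lines above.
print(df_use_percent(465, 443))  # -> 52
print(df_use_percent(465, 331))  # -> 59
```

Note that the denominator is used + avail (908M or 796M here), not the 1016M size, which is why a lower `Avail` alone raises `Use%` even though `Used` is unchanged.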
Random behavior example with the last alpha (4012.0.0) release:
--- a/tmp/4011.0.0+nightly-20240624-2100-o4CAju
+++ b/tmp/4012.0.0-0AUpWi
@@ -1,5 +1,5 @@
File Size Used Avail Use% Type
/boot 127M 61M 66M 48% vfat
-/usr 1016M 468M 331M 59% btrfs
+/usr 1016M 468M 443M 52% btrfs
A similar thing can be observed after rerunning a Beta build.
Hello @tormath1, I will try to reproduce the issue in my env too, to take a better look.
Hello,
I have reproduced the behaviour in my environment: when building a Flatcar image with the Flatcar SDK, it shows up in roughly 50% of the runs of build_image and image_to_vm.sh.
However, I cannot reproduce the issue manually. I tried with this simple script:
#!/bin/bash
set -xe
umount /mnt || true
losetup -d /dev/loop6 || true
# create a loopback file of ~2GB
dd of=test.loop if=/dev/zero bs=1MB count=2048
losetup /dev/loop6 test.loop
# use the exact values from Flatcar layout
mkfs.btrfs --mixed -m single -d single --byte-count 1065345024 --label USR-A /dev/loop6
# mount the btrfs partition
mount -o relatime,seclabel,space_cache=v2,subvolid=5,subvol=/ /dev/loop6 /mnt
btrfs fi usage /mnt
# set the zstd compression
btrfs property set /mnt compression zstd
# write a ~690MB file
dd if=/dev/zero of=/mnt/test_file bs=1KB count=682490 && sync
# replace the ~690MB file with a ~459MB file
dd if=/dev/zero of=/mnt/test_file bs=1KB count=459490 && sync
# df / usage shows correctly
btrfs fi usage /mnt
# try to rebalance and remove the unused btrfs space
btrfs balance start -v -dusage=5 -musage=5 /mnt
# df / usage shows correctly again, no disparity between Free (estimated) and Free (statfs, df)
btrfs fi usage /mnt
I think this is practically a non-issue: from what I understand, with btrfs the syscalls used by df (statfs) do not report the correct values under some conditions.
I will try to reproduce the disparity, but wanted to share this starting point if anyone else is also investigating.
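For anyone who wants to compare against `btrfs fi usage`, the values df prints come from the `statfs`/`statvfs` family of calls. A minimal sketch using Python's `os.statvfs`, assuming a mounted path is passed in (`/` is only a placeholder here; on a Flatcar system `/usr` would be the path of interest). The helper name `df_like` is hypothetical.

```python
import os

MIB = 1024 * 1024


def df_like(path):
    """Return size, used and avail in MiB as derived from statvfs()."""
    st = os.statvfs(path)
    size = st.f_blocks * st.f_frsize
    # df reports "used" as total blocks minus free blocks.
    used = (st.f_blocks - st.f_bfree) * st.f_frsize
    # "avail" is the space available to unprivileged users.
    avail = st.f_bavail * st.f_frsize
    return {
        "size_mib": size / MIB,
        "used_mib": used / MIB,
        "avail_mib": avail / MIB,
    }


print(df_like("/"))
```

Running this next to `btrfs fi usage` on the same mount should show `avail_mib` tracking the `Free (statfs, df)` line rather than `Free (estimated)`, as in the outputs earlier in this issue.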
I have tried a few times to create the image using this small fix and the sizes are converging:
diff --git a/build_library/disk_util b/build_library/disk_util
index f94317e3c1..32893c87c4 100755
--- a/build_library/disk_util
+++ b/build_library/disk_util
@@ -660,6 +660,7 @@ def ReadWriteSubvol(options, partition, disable_rw):
   with PartitionLoop(options, partition) as loop_dev:
     btrfs_mount = tempfile.mkdtemp()
     Sudo(['mount', '-t', 'btrfs', loop_dev, btrfs_mount])
+    Sudo(['btrfs', 'balance', 'start', '-dusage=0', '-musage=0', btrfs_mount])
     try:
       Sudo(['btrfs', 'property', 'set', '-ts', btrfs_mount, 'ro', 'true' if disable_rw else 'false'])
     finally:
@tormath1 I have not yet found the actual cause of this issue or reproduced it in isolation, but this patch should not do any harm, as the balance runs right before the partition is made read-only and before the verity signing.
Adding the notes from commit flatcar/scripts@95d8361 here for visibility:
Note that /usr is also a zstd compressed btrfs partition, so the output
of `df` free size and the actual free size after a file write for
example, will be very different, because the data in that file write has
a compression rate only definable after the file sync.
Unfortunately, there is no determinism in the btrfs file system case, because even if
you could in theory pre-compress with zstd the file before, and have an
idea about the size to be used, you still cannot really predict also the metadata
size for that file write.
While checking the journalctl output on the latest main, I noticed this warning: 'nologreplay' is deprecated, use 'rescue=nologreplay' instead. However, as far as I know no such mount option is used in the flatcar/scripts repo; the deprecated values were recently removed by flatcar/scripts@18265de.
@jepio do you have an idea where the warning might come from? I checked the flatcar init and bootengine repos, but those also look fine.
/usr mount log:
Jul 01 16:38:11 localhost systemd[1]: Found device dev-mapper-usr.device - /dev/mapper/usr.
Jul 01 16:38:11 localhost systemd[1]: Mounting sysusr-usr.mount - /sysusr/usr...
Jul 01 16:38:11 localhost systemd[1]: Finished verity-setup.service - Verity Setup for /dev/mapper/usr.
Jul 01 16:38:11 localhost kernel: BTRFS info (device dm-0): first mount of filesystem 60877fc8-37bb-4e8a-ae4f-aaea0a123cfa
Jul 01 16:38:11 localhost kernel: BTRFS info (device dm-0): using crc32c (crc32c-intel) checksum algorithm
Jul 01 16:38:11 localhost kernel: BTRFS warning (device dm-0): 'nologreplay' is deprecated, use 'rescue=nologreplay' instead
Jul 01 16:38:11 localhost kernel: BTRFS info (device dm-0): disabling log replay at mount time
Jul 01 16:38:11 localhost kernel: BTRFS info (device dm-0): using free space tree
Jul 01 16:38:11 localhost systemd[1]: Mounted sysusr-usr.mount - /sysusr/usr.