codesign for booting
lundman opened this issue · comments
I went to update the ZFS-on-boot instructions https://openzfsonosx.org/wiki/ZFS_on_Boot and attempted to boot Catalina, but we do have an issue with mounting root:
Loaded module v1.9.2-4-d4889d276e, ZFS pool version 5000, ZFS filesystem version 5
zfs_boot_publish_bootfs: publishing bootfs [rpool/ROOT/Catalina]
zfs_boot_publish_bootfs done
Got boot device = IOService:/IOResources/net_lundman_zfs_zvol/rpool/ZFSDatasetProxy/IOBlockStorageDriver/rpool Media/ZFSDatasetScheme/Catalina@1
BSD root: disk6s1, major 1, minor 23
hfs_ValidateHFSPlusVolumeHeader: unknown Volume Signature : 0
hfs_mount: hfs_mountfs returned error=22 for device unknown-dev
ZFS: zfs_vfs_mountroot
Setting readonly
Not booted from APFS, skipping apfs.util
boot _checkBrokenSignatureWithTeamIDFatal(LazyPath *, struct cs_blob *): no registered daemon port for check_broken_signature_with_teamid_fatal
mac_vnode_check_signature: /Library/Filesystems/zfs.fs/Contents/Resouces/mount_zfs: code signature validation failed fatally:
When validating /Library/Filesystems/zfs.fs/Contents/Resources/mount_zfs:
The code contains a Team ID, but validating its signature failed.
Please check you system log.proc 5: load code signature error 4 for file "mount_zfs"
port is not ready for callouts
mount: /: Killed
If mount_zfs is not codesigned, it has the same complaints about the libraries. So an unsigned, static, zfs binary named zfs_mount is the way to go.
It does seem to stall a bit later due to memory concerns though.
OK, it takes about 20 minutes to boot Catalina with 2 GB of RAM. The spindump during boot looks like:
http://www.lundman.net/boot-spindump.txt
It seems that any paging falls apart, unsure why yet - if you spot something peculiar, mention it.
With 4 GB of RAM, it does boot to the UI after about 20 minutes. At that point, I could disable mds, run launchctl remove com.apple.appstoreagent, and zfs set sync=disabled rpool/ROOT/Catalina to give it a bit less IO.
Logging into the GUI to test, it appears that somewhere along the line we managed to fix the font problem:
(Apologies about the hacky 'photoshop' job)
If we can solve the cause of it running in molasses, it would be potentially possible to run the OS this way.
And also a spindump waiting long enough for it to be idle. It is still sluggish, and kernel_task is pretty busy - the trick will be to find where.
and flamegraph while it is idle:
http://www.lundman.net/zfsboot.svg
arc_reclaim_thread() shows a bit too much, I would expect that to be mostly idle (even if low on memory). zfs_vnop_pagein() is also taking a lot of time.
arc_reclaim_thread() seems to be looping here:
} else if (evicted >= SPA_MAXBLOCKSIZE * ARCSTAT(arc_reclaim_waiters_count)) {
// we evicted plenty of buffers, so let's wake up
// all the waiters rather than having them stall
ARCSTAT_BUMP(arc_reclaim_waiters_early_broadcast);
cv_broadcast(&arc_reclaim_waiters_cv);
kstat.zfs.misc.arcstats.arc_reclaim_waiters_cnt: 0
kstat.zfs.misc.arcstats.arc_reclaim_waiters_cur: 0
kstat.zfs.misc.arcstats.arc_reclaim_waiters_sig: 0
kstat.zfs.misc.arcstats.arc_reclaim_waiters_bcst: 244208
kstat.zfs.misc.arcstats.arc_reclaim_waiters_tout: 0
Just pulling out the zfs calls, and sorting on frequency:
31 lz4_decompress_zfs
31 zap_lockdir
32 dmu_read_uio_dnode
35 dmu_buf_hold
35 zfs_vnop_read
36 zfs_read
43 -
48 zio_done
55 __zio_execute
61 vdev_mirror_io_start
65 dbuf_hold
68 dbuf_hold_impl
73 __dbuf_hold_impl
73 vdev_disk_io_start
80 buf_strategy_iokit
85 arc_read
121 dmu_read_impl
127 dmu_read
135 dbuf_read
135 zio_vdev_io_start
141 zio_nowait
184 zfs_vnop_pagein
209 dmu_buf_hold_array_by_dnode
217 zio_wait
Note to self: in module/zfs/spa_config.c, spa_write_cachefile() will panic during boot, due to vn_open() being called before rootvnode has been set.
OK, the boot issues have been fixed, except for the performance one. There is a signed PKG on the wiki if anyone wants to try ZFS on Boot.