openzfsonosx / zfs

OpenZFS on OS X

Home Page: https://openzfsonosx.org/

Detaching 1/N resilvering disks caused remaining N-1 resilver to instantly succeed without completing

josephvusich opened this issue

TL;DR: I attached a third disk to every mirror VDEV. The new disk attached to the special VDEV was clearly bad (write errors), so I detached it. The remaining new disks then instantly "completed" resilvering without error, even though the resilver should have continued for hours. I confirmed the problem by starting a manual scrub, which found millions of CKSUM errors on the disks that had been incorrectly marked as resilvered.

More detailed walkthrough below. Note that mirror-0 and mirror-2 are HDDs, and the special mirror-1 consists of SSDs.

OS/ZFS version

$ zfs version
zfs-1.9.4-0
zfs-kmod-1.9.4-0

$ sw_vers
ProductName:    Mac OS X
ProductVersion: 10.15.6
BuildVersion:   19G2021

Initial pool layout

	NAME           STATE     READ WRITE CKSUM
	tank           ONLINE       0     0     0
	  mirror-0     ONLINE       0     0     0
	    media-0-0  ONLINE       0     0     0
	    media-0-1  ONLINE       0     0     0
	  mirror-2     ONLINE       0     0     0
	    media-2-0  ONLINE       0     0     0
	    media-2-1  ONLINE       0     0     0
	special	
	  mirror-1     ONLINE       0     0     0
	    media-1-0  ONLINE       0     0     0
	    media-1-1  ONLINE       0     0     0

Adding mirrors

zpool attach tank media-1-0 /dev/disk8
zpool attach tank media-0-0 /dev/disk9
zpool attach tank media-2-0 /dev/disk10

One bad disk identified during resilver

(resilvering)

	NAME           STATE     READ WRITE CKSUM
	tank           ONLINE       0     0     0
	  mirror-0     ONLINE       0     0     0
	    media-0-0  ONLINE       0     0     0
	    media-0-1  ONLINE       0     0     0
	    disk9      ONLINE       0     0     0
	  mirror-2     ONLINE       0     0     0
	    media-2-0  ONLINE       0     0     0
	    media-2-1  ONLINE       0     0     0
	    disk10     ONLINE       0     0     0
	special	
	  mirror-1     ONLINE       0     0     0
	    media-1-0  ONLINE       0     0     0
	    media-1-1  ONLINE       0     0     0
	    disk8      ONLINE       0 4.08M   326
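A bad disk like disk8 can also be spotted from a script rather than by eye. The following is a minimal sketch, assuming the config table keeps the NAME/STATE/READ/WRITE/CKSUM column layout shown above; `vdevs_with_errors` is a hypothetical helper name, not part of ZFS.

```shell
# Hypothetical helper: print every vdev whose READ, WRITE, or CKSUM
# counter is non-zero. Reads `zpool status` text on stdin.
# Assumes the column layout NAME STATE READ WRITE CKSUM shown in this issue.
vdevs_with_errors() {
  awk '
    # Only consider rows whose second column is a vdev state, so the
    # header row and lines such as "errors: ..." are skipped.
    NF >= 5 && $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ {
      # Counters may be abbreviated (e.g. 4.08M), so compare as strings.
      if ($3 != "0" || $4 != "0" || $5 != "0")
        print $1, "read=" $3, "write=" $4, "cksum=" $5
    }'
}

# Usage (requires a real pool):
#   zpool status tank | vdevs_with_errors
```

Run against the status above, this would report only the disk8 row, since every other device shows zero in all three counters.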

Detach bad disk

zpool detach tank /dev/disk8

ZFS stops resilver for remaining disks without error

(no resilver in progress)

	NAME           STATE     READ WRITE CKSUM
	tank           ONLINE       0     0     0
	  mirror-0     ONLINE       0     0     0
	    media-0-0  ONLINE       0     0     0
	    media-0-1  ONLINE       0     0     0
	    disk9      ONLINE       0     0     0
	  mirror-2     ONLINE       0     0     0
	    media-2-0  ONLINE       0     0     0
	    media-2-1  ONLINE       0     0     0
	    disk10     ONLINE       0     0     0
	special	
	  mirror-1     ONLINE       0     0     0
	    media-1-0  ONLINE       0     0     0
	    media-1-1  ONLINE       0     0     0
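Because the resilver ends silently, a script that detaches a disk could compare the scan state before and after. This is only a sketch of such a check, not a fix for the bug: `scan_state` is a hypothetical helper, and the `scan:` line wording it matches ("resilver in progress", "resilvered", "scrub in progress", "scrub repaired") is an assumption about `zpool status` output that should be verified against the installed version.

```shell
# Hypothetical helper: classify the "scan:" line of `zpool status` (stdin).
# The matched phrases are assumptions about the status wording.
scan_state() {
  awk '
    $1 == "scan:" {
      if ($0 ~ /resilver in progress/)   state = "resilvering"
      else if ($0 ~ /scrub in progress/) state = "scrubbing"
      else if ($2 == "resilvered")       state = "resilvered"
      else if ($0 ~ /scrub repaired/)    state = "scrubbed"
      else                               state = "other"
    }
    # Print "none" when the input had no scan: line at all.
    END { print (state == "" ? "none" : state) }'
}

# Usage (requires a real pool):
#   before=$(zpool status tank | scan_state)
#   zpool detach tank /dev/disk8
#   after=$(zpool status tank | scan_state)
#   if [ "$before" = resilvering ] && [ "$after" != resilvering ]; then
#     echo "resilver ended right after detach; scrub before trusting the new disks"
#   fi
```

A resilver that flips from in progress to finished immediately after a detach matches the behaviour reported here, so the sketch treats it as a reason to scrub rather than as success.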

Start scrub

zpool scrub tank

The resilver was clearly not finished

(scrub in progress)

	NAME           STATE     READ WRITE CKSUM
	tank           ONLINE       0     0     0
	  mirror-0     ONLINE       0     0     0
	    media-0-0  ONLINE       0     0     0
	    media-0-1  ONLINE       0     0     0
	    disk9      ONLINE       0     0 3.08M
	  mirror-2     ONLINE       0     0     0
	    media-2-0  ONLINE       0     0     0
	    media-2-1  ONLINE       0     0     0
	    disk10     ONLINE       0     0 2.75M
	special	
	  mirror-1     ONLINE       0     0     0
	    media-1-0  ONLINE       0     0     0
	    media-1-1  ONLINE       0     0     0