Unable to import pool with 0.6.2 after resilver with 2014-01-16 git version
AndCycle opened this issue · comments
tl;dr
after a resilver with the git version,
the pool can no longer be imported by zol 0.6.2/freebsd/solaris;
they report that the pool metadata is corrupted
longer version
here is a raw dd dump and some related info
that you can use for detailed diagnosis:
http://www.andcycle.idv.tw/~andcycle/tmp/tmp/zol/20140201/
I hit an issue similar to the one discussed in this thread, except that I don't have the label issue:
http://comments.gmane.org/gmane.linux.file-systems.zfs.user/12105
Gentoo
sys-kernel/gentoo-sources-3.12.7
sys-kernel/gentoo-sources-3.12.8
timeline
a long time ago
  zfs-0.6.2
  create zroot
  create zmess
2013/12/19
  create ztmp
2013/12/21
  zfs-0.6.2-r1
2014/01/16
  zfs-0.6.2-r2
2014/01/16
  zfs-9999
2014/01/23
  zroot resilvered
2014/01/27
  ztmp resilvered
2014/01/30
  create zstor-past
2014/02/01
  zfs-0.6.2-r3
  zroot state FAULTED, The pool metadata is corrupted.
  ztmp state FAULTED, The pool metadata is corrupted.
  zmess state ONLINE
  zstor-past state ONLINE
2014/02/01
  zfs-9999
  zroot state ONLINE
  ztmp state ONLINE
  zmess state ONLINE
  zstor-past state ONLINE
That's all, I think.
I just messed up my production server,
and now I am stuck on the git version.
Tell me if any more information is needed.
@AndCycle I'm going to try to recreate this on my own while I'm downloading your sda2.xz image, but I'd like to clarify the steps that caused the corruption. Am I correct in saying that your "ztmp" pool was created to demonstrate the problem and that the problem was caused by running a "zpool scrub" on it? Did the pool import properly under 0.6.2/FreeBSD/Solaris prior to running the scrub? Also, could you please post the exact version of the code you're running (the output of cat /sys/module/zfs/version)?
What did you do with the "ztmp" pool other than creating the 2 filesystems on it before running the scrub? Did you copy any files on to it?
@AndCycle I was able to reproduce the problem rather easily on my test system. No need for your image or anything else. I'm going to look into this issue now that I can reproduce it.
The problem is that in 1421c89 the size of a zbookmark_t was expanded. Unfortunately, the size of a zbookmark_t is effectively part of the on-disk format that the current ZFS code base expects. I've worked up a patch in dweeezil/zfs@c5223f5, which should only be considered a work in progress at this point. I have, however, tested it, and it seems to work fine. Read the commit message for details.
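To illustrate why growing a structure like this can break older readers, here is a minimal C sketch. The struct and field names below are hypothetical stand-ins, not the actual ZFS definitions, and the "scan state" record is only an assumed example of something persisted with a bookmark embedded in it. The point is that adding one field to the embedded struct shifts the offset of everything stored after it, so code built against the old layout reads the wrong bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "old" bookmark layout: four 64-bit words. */
typedef struct bookmark_old {
    uint64_t objset;
    uint64_t object;
    int64_t  level;
    uint64_t blkid;
} bookmark_old_t;

/* Hypothetical "new" layout: same fields plus one extra word
 * (an assumed debugging field, named here only for illustration). */
typedef struct bookmark_new {
    uint64_t objset;
    uint64_t object;
    int64_t  level;
    uint64_t blkid;
    uint64_t debug_tag;
} bookmark_new_t;

/* Hypothetical persisted record that embeds a bookmark,
 * as written by code using the old layout. */
typedef struct scan_state_old {
    uint64_t       state;
    bookmark_old_t bookmark;
    uint64_t       flags;
} scan_state_old_t;

/* The same record as laid out by code using the expanded bookmark. */
typedef struct scan_state_new {
    uint64_t       state;
    bookmark_new_t bookmark;
    uint64_t       flags;
} scan_state_new_t;

/* A compile-time guard like this would catch an accidental size change. */
_Static_assert(sizeof(bookmark_old_t) == 4 * sizeof(uint64_t),
               "persisted bookmark must stay four 64-bit words");

/* Byte offset of the field that follows the bookmark in each layout. */
size_t flags_offset_old(void) { return offsetof(scan_state_old_t, flags); }
size_t flags_offset_new(void) { return offsetof(scan_state_new_t, flags); }
```

In this sketch, flags_offset_old() and flags_offset_new() differ by the size of the added field, which is the kind of mismatch that makes one version misinterpret data written by the other.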
@dweeezil Thanks for looking into this problem. I'll probably try out your patch when I get some free time :)
@dweeezil Nice work. This was on the short list of things that I wanted to debug before tagging a new Gentoo patch set. I planned to debug it before filing an issue, but @AndCycle beat me to reporting it and you appear to have beaten me to debugging it. :)
@AndCycle Nice description of the issue. Your report contains information that is far more useful than the information that I had to use. Last Monday, I discovered that HEAD broke backward compatibility. That put this on my short list of things to debug, but I had no idea what the trigger was and would not have spent much time on it until next week.
That being said, I am in a position to test and review this, but I might not have time until Monday.