openzfs / zfs

OpenZFS on Linux and FreeBSD

Home Page: https://openzfs.github.io/openzfs-docs

zfs hangs on mounting

sanmai-NL opened this issue · comments

Yesterday I upgraded a ZFS pool ("WD_1"), created with ZFS on Linux, from version 28 to version 5000. Other events may have overlapped with this activity. I also set a property:
zfs set primarycache=metadata WD_1
I also started a scrub, I believe after the ZFS upgrade had completed, but I am not sure.
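
(For reference, the sequence described above corresponds roughly to the following commands; this is a reconstruction from the description, not the exact commands that were run.)

# Reconstruction of the steps described above; pool name as in the report
zpool upgrade WD_1                    # pool version 28 -> 5000 (feature flags)
zfs set primarycache=metadata WD_1    # cache only metadata in the ARC
zpool scrub WD_1                      # scrub issued around the same time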

Since then, whenever I issue the command zfs mount WD_1, it never completes. From that point on, various other commands never complete either.

I am using Arch Linux (3.12.1-3-ARCH #1 SMP PREEMPT Tue Nov 26 11:17:02 CET 2013 x86_64 GNU/Linux) with the ZFS packages (version 0.6.2_3.12.1-2) from the unofficial repo at http://demizerone.com/demz-repo-core/ .

This command still succeeds even while the mount hangs:

zpool status -v WD_1
  pool: WD_1
 state: ONLINE
  scan: scrub repaired 0 in 0h59m with 0 errors on Wed Nov 27 12:55:37 2013
config:

    NAME                                                STATE     READ WRITE CKSUM
    WD_1                                                ONLINE       0     0     0
      usb-WD_My_Book_1140_574D43315431383031363538-0:0  ONLINE       0     0     0

errors: No known data errors

zpool iostat shows:

               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
WD_1        96.5G  2.62T      0      0  1.23K  1.76K
data         667G  1.16T      0      0    380    226
----------  -----  -----  -----  -----  -----  -----

dmesg reports:

[  162.366700] fuse init (API version 7.22)
[  240.208425] INFO: task txg_quiesce:4305 blocked for more than 120 seconds.
[  240.210011]       Tainted: P           O 3.12.1-3-ARCH #1
[  240.211917] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.213860] txg_quiesce     D 0000000000000002     0  4305      2 0x00000000
[  240.213867]  ffff88030a141d58 0000000000000046 00000000000144c0 ffff88030a141fd8
[  240.213872]  ffff88030a141fd8 00000000000144c0 ffff88030d97f0f0 ffff8803043585ce
[  240.213877]  ffff88030a141cf8 ffffffff8129a484 ffffffff810bffff ffff88030a141d40
[  240.213882] Call Trace:
[  240.213894]  [<ffffffff8129a484>] ? vsnprintf+0x214/0x680
[  240.213901]  [<ffffffff810bffff>] ? __do_adjtimex+0x17f/0x530
[  240.213907]  [<ffffffff81189357>] ? __kmalloc+0x247/0x2b0
[  240.213933]  [<ffffffffa05806fe>] ? kmem_alloc_debug+0x20e/0x500 [spl]
[  240.213940]  [<ffffffff814f2dc9>] schedule+0x29/0x70
[  240.213950]  [<ffffffffa058c42d>] cv_wait_common+0x10d/0x1c0 [spl]
[  240.213955]  [<ffffffff81085ca0>] ? wake_up_atomic_t+0x30/0x30
[  240.213965]  [<ffffffffa058c4f5>] __cv_wait+0x15/0x20 [spl]
[  240.214001]  [<ffffffffa0691e0b>] txg_quiesce_thread+0x28b/0x400 [zfs]
[  240.214033]  [<ffffffffa0691b80>] ? txg_sync_thread+0x5c0/0x5c0 [zfs]
[  240.214043]  [<ffffffffa0584bca>] thread_generic_wrapper+0x7a/0x90 [spl]
[  240.214053]  [<ffffffffa0584b50>] ? __thread_exit+0xa0/0xa0 [spl]
[  240.214059]  [<ffffffff81084e80>] kthread+0xc0/0xd0
[  240.214064]  [<ffffffff81084dc0>] ? kthread_create_on_node+0x120/0x120
[  240.214069]  [<ffffffff814fc27c>] ret_from_fork+0x7c/0xb0
[  240.214074]  [<ffffffff81084dc0>] ? kthread_create_on_node+0x120/0x120
[  240.214078] INFO: task mount.zfs:4309 blocked for more than 120 seconds.
[  240.216102]       Tainted: P           O 3.12.1-3-ARCH #1
[  240.217171] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  240.218243] mount.zfs       D 0000000000000002     0  4309      1 0x00000004
[  240.218246]  ffff8803041239f0 0000000000000086 00000000000144c0 ffff880304123fd8
[  240.218248]  ffff880304123fd8 00000000000144c0 ffff88030a180000 ffff880304123960
[  240.218250]  ffffffff810902ab 000000001fc144c0 ffff8800cd84ab70 0000000000000001
[  240.218252] Call Trace:
[  240.218256]  [<ffffffff810902ab>] ? ttwu_stat+0x9b/0x110
[  240.218258]  [<ffffffff81094fdf>] ? try_to_wake_up+0x1ff/0x2d0
[  240.218261]  [<ffffffff81095102>] ? default_wake_function+0x12/0x20
[  240.218262]  [<ffffffff814f2dc9>] schedule+0x29/0x70
[  240.218268]  [<ffffffffa058c42d>] cv_wait_common+0x10d/0x1c0 [spl]
[  240.218317]  [<ffffffff81085ca0>] ? wake_up_atomic_t+0x30/0x30
[  240.218321]  [<ffffffffa058c4f5>] __cv_wait+0x15/0x20 [spl]
[  240.218335]  [<ffffffffa069104b>] txg_wait_synced+0xcb/0x1b0 [zfs]
[  240.218347]  [<ffffffffa06d166e>] zil_replay_log_record+0xde/0x1a0 [zfs]
[  240.218357]  [<ffffffffa06d1d3e>] zil_parse+0x41e/0x7b0 [zfs]
[  240.218358]  [<ffffffff814f2849>] ? __schedule+0x3f9/0x950
[  240.218368]  [<ffffffffa06d0960>] ? zil_aitx_compare+0x20/0x20 [zfs]
[  240.218378]  [<ffffffffa06d1590>] ? zil_replay_error.isra.8+0xb0/0xb0 [zfs]
[  240.218389]  [<ffffffffa06d4768>] zil_replay+0xa8/0x110 [zfs]
[  240.218400]  [<ffffffffa06c2510>] zfs_sb_setup+0x140/0x150 [zfs]
[  240.218411]  [<ffffffffa06c2a22>] zfs_domount+0x252/0x2c0 [zfs]
[  240.218420]  [<ffffffffa06dec00>] ? zpl_mount+0x30/0x30 [zfs]
[  240.218429]  [<ffffffffa06dec0e>] zpl_fill_super+0xe/0x20 [zfs]
[  240.218432]  [<ffffffff811a765d>] mount_nodev+0x4d/0xb0
[  240.218441]  [<ffffffffa06debf5>] zpl_mount+0x25/0x30 [zfs]
[  240.218443]  [<ffffffff811a8279>] mount_fs+0x39/0x1b0
[  240.218447]  [<ffffffff811c23f7>] vfs_kern_mount+0x67/0x100
[  240.218449]  [<ffffffff811c4a9e>] do_mount+0x23e/0xa90
[  240.218453]  [<ffffffff8113979e>] ? __get_free_pages+0xe/0x50
[  240.218455]  [<ffffffff811c46e6>] ? copy_mount_options+0x36/0x170
[  240.218457]  [<ffffffff811c5373>] SyS_mount+0x83/0xc0
[  240.218459]  [<ffffffff814fc32d>] system_call_fastpath+0x1a/0x1f
[  315.157321] tun: Universal TUN/TAP device driver, 1.6
[  315.157326] tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
[  322.691508] Bridge firewalling registered
[  322.750072] r8169 0000:03:00.0 enp3s0: link down
[  322.750087] device enp3s0 entered promiscuous mode
[  322.750091] r8169 0000:03:00.0 enp3s0: link down
[  322.750125] IPv6: ADDRCONF(NETDEV_UP): enp3s0: link is not ready
[  322.759165] device tap0 entered promiscuous mode
[  322.759196] IPv6: ADDRCONF(NETDEV_UP): tap0: link is not ready
[  322.765615] IPv6: ADDRCONF(NETDEV_UP): br0: link is not ready
[  325.558443] r8169 0000:03:00.0 enp3s0: link up
[  325.558456] IPv6: ADDRCONF(NETDEV_CHANGE): enp3s0: link becomes ready
[  325.559336] br0: port 1(enp3s0) entered forwarding state
[  325.559347] br0: port 1(enp3s0) entered forwarding state
[  325.559372] IPv6: ADDRCONF(NETDEV_CHANGE): br0: link becomes ready
[  340.577329] br0: port 1(enp3s0) entered forwarding state
[  360.141517] INFO: task txg_quiesce:4305 blocked for more than 120 seconds.
[  360.141526]       Tainted: P           O 3.12.1-3-ARCH #1
[  360.141528] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.141531] txg_quiesce     D 0000000000000002     0  4305      2 0x00000000
[  360.141538]  ffff88030a141d58 0000000000000046 00000000000144c0 ffff88030a141fd8
[  360.141543]  ffff88030a141fd8 00000000000144c0 ffff88030d97f0f0 ffff8803043585ce
[  360.141548]  ffff88030a141cf8 ffffffff8129a484 ffffffff810bffff ffff88030a141d40
[  360.141553] Call Trace:
[  360.141565]  [<ffffffff8129a484>] ? vsnprintf+0x214/0x680
[  360.141573]  [<ffffffff810bffff>] ? __do_adjtimex+0x17f/0x530
[  360.141579]  [<ffffffff81189357>] ? __kmalloc+0x247/0x2b0
[  360.141605]  [<ffffffffa05806fe>] ? kmem_alloc_debug+0x20e/0x500 [spl]
[  360.141613]  [<ffffffff814f2dc9>] schedule+0x29/0x70
[  360.141623]  [<ffffffffa058c42d>] cv_wait_common+0x10d/0x1c0 [spl]
[  360.141628]  [<ffffffff81085ca0>] ? wake_up_atomic_t+0x30/0x30
[  360.141638]  [<ffffffffa058c4f5>] __cv_wait+0x15/0x20 [spl]
[  360.141674]  [<ffffffffa0691e0b>] txg_quiesce_thread+0x28b/0x400 [zfs]
[  360.141707]  [<ffffffffa0691b80>] ? txg_sync_thread+0x5c0/0x5c0 [zfs]
[  360.141717]  [<ffffffffa0584bca>] thread_generic_wrapper+0x7a/0x90 [spl]
[  360.141727]  [<ffffffffa0584b50>] ? __thread_exit+0xa0/0xa0 [spl]
[  360.141734]  [<ffffffff81084e80>] kthread+0xc0/0xd0
[  360.141739]  [<ffffffff81084dc0>] ? kthread_create_on_node+0x120/0x120
[  360.141744]  [<ffffffff814fc27c>] ret_from_fork+0x7c/0xb0
[  360.141749]  [<ffffffff81084dc0>] ? kthread_create_on_node+0x120/0x120
[  360.141754] INFO: task mount.zfs:4309 blocked for more than 120 seconds.
[  360.141757]       Tainted: P           O 3.12.1-3-ARCH #1
[  360.141759] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  360.141761] mount.zfs       D 0000000000000002     0  4309      1 0x00000004
[  360.141765]  ffff8803041239f0 0000000000000086 00000000000144c0 ffff880304123fd8
[  360.141770]  ffff880304123fd8 00000000000144c0 ffff88030a180000 ffff880304123960
[  360.141775]  ffffffff810902ab 000000001fc144c0 ffff8800cd84ab70 0000000000000001
[  360.141779] Call Trace:
[  360.141786]  [<ffffffff810902ab>] ? ttwu_stat+0x9b/0x110
[  360.141791]  [<ffffffff81094fdf>] ? try_to_wake_up+0x1ff/0x2d0
[  360.141796]  [<ffffffff81095102>] ? default_wake_function+0x12/0x20
[  360.141800]  [<ffffffff814f2dc9>] schedule+0x29/0x70
[  360.141810]  [<ffffffffa058c42d>] cv_wait_common+0x10d/0x1c0 [spl]
[  360.141814]  [<ffffffff81085ca0>] ? wake_up_atomic_t+0x30/0x30
[  360.141824]  [<ffffffffa058c4f5>] __cv_wait+0x15/0x20 [spl]
[  360.141855]  [<ffffffffa069104b>] txg_wait_synced+0xcb/0x1b0 [zfs]
[  360.141879]  [<ffffffffa06d166e>] zil_replay_log_record+0xde/0x1a0 [zfs]
[  360.141902]  [<ffffffffa06d1d3e>] zil_parse+0x41e/0x7b0 [zfs]
[  360.141907]  [<ffffffff814f2849>] ? __schedule+0x3f9/0x950
[  360.141929]  [<ffffffffa06d0960>] ? zil_aitx_compare+0x20/0x20 [zfs]
[  360.141952]  [<ffffffffa06d1590>] ? zil_replay_error.isra.8+0xb0/0xb0 [zfs]
[  360.141977]  [<ffffffffa06d4768>] zil_replay+0xa8/0x110 [zfs]
[  360.142003]  [<ffffffffa06c2510>] zfs_sb_setup+0x140/0x150 [zfs]
[  360.142027]  [<ffffffffa06c2a22>] zfs_domount+0x252/0x2c0 [zfs]
[  360.142048]  [<ffffffffa06dec00>] ? zpl_mount+0x30/0x30 [zfs]
[  360.142068]  [<ffffffffa06dec0e>] zpl_fill_super+0xe/0x20 [zfs]
[  360.142074]  [<ffffffff811a765d>] mount_nodev+0x4d/0xb0
[  360.142095]  [<ffffffffa06debf5>] zpl_mount+0x25/0x30 [zfs]
[  360.142099]  [<ffffffff811a8279>] mount_fs+0x39/0x1b0
[  360.142106]  [<ffffffff811c23f7>] vfs_kern_mount+0x67/0x100
[  360.142111]  [<ffffffff811c4a9e>] do_mount+0x23e/0xa90
[  360.142118]  [<ffffffff8113979e>] ? __get_free_pages+0xe/0x50
[  360.142123]  [<ffffffff811c46e6>] ? copy_mount_options+0x36/0x170
[  360.142127]  [<ffffffff811c5373>] SyS_mount+0x83/0xc0
[  360.142132]  [<ffffffff814fc32d>] system_call_fastpath+0x1a/0x1f
[  480.064579] INFO: task txg_quiesce:4305 blocked for more than 120 seconds.
[  480.064586]       Tainted: P           O 3.12.1-3-ARCH #1
[  480.064588] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  480.064591] txg_quiesce     D 0000000000000002     0  4305      2 0x00000000
[  480.064596]  ffff88030a141d58 0000000000000046 00000000000144c0 ffff88030a141fd8
[  480.064600]  ffff88030a141fd8 00000000000144c0 ffff88030d97f0f0 ffff8803043585ce
[  480.064604]  ffff88030a141cf8 ffffffff8129a484 ffffffff810bffff ffff88030a141d40
[  480.064607] Call Trace:
[  480.064618]  [<ffffffff8129a484>] ? vsnprintf+0x214/0x680
[  480.064623]  [<ffffffff810bffff>] ? __do_adjtimex+0x17f/0x530
[  480.064629]  [<ffffffff81189357>] ? __kmalloc+0x247/0x2b0
[  480.064652]  [<ffffffffa05806fe>] ? kmem_alloc_debug+0x20e/0x500 [spl]
[  480.064657]  [<ffffffff814f2dc9>] schedule+0x29/0x70
[  480.064665]  [<ffffffffa058c42d>] cv_wait_common+0x10d/0x1c0 [spl]
[  480.064670]  [<ffffffff81085ca0>] ? wake_up_atomic_t+0x30/0x30
[  480.064677]  [<ffffffffa058c4f5>] __cv_wait+0x15/0x20 [spl]
[  480.064705]  [<ffffffffa0691e0b>] txg_quiesce_thread+0x28b/0x400 [zfs]
[  480.064730]  [<ffffffffa0691b80>] ? txg_sync_thread+0x5c0/0x5c0 [zfs]
[  480.064737]  [<ffffffffa0584bca>] thread_generic_wrapper+0x7a/0x90 [spl]
[  480.064744]  [<ffffffffa0584b50>] ? __thread_exit+0xa0/0xa0 [spl]
[  480.064749]  [<ffffffff81084e80>] kthread+0xc0/0xd0
[  480.064753]  [<ffffffff81084dc0>] ? kthread_create_on_node+0x120/0x120
[  480.064757]  [<ffffffff814fc27c>] ret_from_fork+0x7c/0xb0
[  480.064761]  [<ffffffff81084dc0>] ? kthread_create_on_node+0x120/0x120
[  480.064764] INFO: task mount.zfs:4309 blocked for more than 120 seconds.
[  480.064766]       Tainted: P           O 3.12.1-3-ARCH #1
[  480.064768] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  480.064769] mount.zfs       D 0000000000000002     0  4309      1 0x00000004
[  480.064773]  ffff8803041239f0 0000000000000086 00000000000144c0 ffff880304123fd8
[  480.064776]  ffff880304123fd8 00000000000144c0 ffff88030a180000 ffff880304123960
[  480.064780]  ffffffff810902ab 000000001fc144c0 ffff8800cd84ab70 0000000000000001
[  480.064783] Call Trace:
[  480.064787]  [<ffffffff810902ab>] ? ttwu_stat+0x9b/0x110
[  480.064791]  [<ffffffff81094fdf>] ? try_to_wake_up+0x1ff/0x2d0
[  480.064795]  [<ffffffff81095102>] ? default_wake_function+0x12/0x20
[  480.064798]  [<ffffffff814f2dc9>] schedule+0x29/0x70
[  480.064806]  [<ffffffffa058c42d>] cv_wait_common+0x10d/0x1c0 [spl]
[  480.064809]  [<ffffffff81085ca0>] ? wake_up_atomic_t+0x30/0x30
[  480.064816]  [<ffffffffa058c4f5>] __cv_wait+0x15/0x20 [spl]
[  480.064839]  [<ffffffffa069104b>] txg_wait_synced+0xcb/0x1b0 [zfs]
[  480.064858]  [<ffffffffa06d166e>] zil_replay_log_record+0xde/0x1a0 [zfs]
[  480.064875]  [<ffffffffa06d1d3e>] zil_parse+0x41e/0x7b0 [zfs]
[  480.064878]  [<ffffffff814f2849>] ? __schedule+0x3f9/0x950
[  480.064895]  [<ffffffffa06d0960>] ? zil_aitx_compare+0x20/0x20 [zfs]
[  480.064912]  [<ffffffffa06d1590>] ? zil_replay_error.isra.8+0xb0/0xb0 [zfs]
[  480.064931]  [<ffffffffa06d4768>] zil_replay+0xa8/0x110 [zfs]
[  480.064951]  [<ffffffffa06c2510>] zfs_sb_setup+0x140/0x150 [zfs]
[  480.064969]  [<ffffffffa06c2a22>] zfs_domount+0x252/0x2c0 [zfs]
[  480.064985]  [<ffffffffa06dec00>] ? zpl_mount+0x30/0x30 [zfs]
[  480.065000]  [<ffffffffa06dec0e>] zpl_fill_super+0xe/0x20 [zfs]
[  480.065004]  [<ffffffff811a765d>] mount_nodev+0x4d/0xb0
[  480.065020]  [<ffffffffa06debf5>] zpl_mount+0x25/0x30 [zfs]
[  480.065023]  [<ffffffff811a8279>] mount_fs+0x39/0x1b0
[  480.065028]  [<ffffffff811c23f7>] vfs_kern_mount+0x67/0x100
[  480.065032]  [<ffffffff811c4a9e>] do_mount+0x23e/0xa90
[  480.065038]  [<ffffffff8113979e>] ? __get_free_pages+0xe/0x50
[  480.065041]  [<ffffffff811c46e6>] ? copy_mount_options+0x36/0x170
[  480.065045]  [<ffffffff811c5373>] SyS_mount+0x83/0xc0
[  480.065048]  [<ffffffff814fc32d>] system_call_fastpath+0x1a/0x1f
[  599.987710] INFO: task txg_quiesce:4305 blocked for more than 120 seconds.
[  599.987719]       Tainted: P           O 3.12.1-3-ARCH #1
[  599.987721] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  599.987725] txg_quiesce     D 0000000000000002     0  4305      2 0x00000000
[  599.987731]  ffff88030a141d58 0000000000000046 00000000000144c0 ffff88030a141fd8
[  599.987737]  ffff88030a141fd8 00000000000144c0 ffff88030d97f0f0 ffff8803043585ce
[  599.987742]  ffff88030a141cf8 ffffffff8129a484 ffffffff810bffff ffff88030a141d40
[  599.987747] Call Trace:
[  599.987760]  [<ffffffff8129a484>] ? vsnprintf+0x214/0x680
[  599.987768]  [<ffffffff810bffff>] ? __do_adjtimex+0x17f/0x530
[  599.987774]  [<ffffffff81189357>] ? __kmalloc+0x247/0x2b0
[  599.987801]  [<ffffffffa05806fe>] ? kmem_alloc_debug+0x20e/0x500 [spl]
[  599.987809]  [<ffffffff814f2dc9>] schedule+0x29/0x70
[  599.987819]  [<ffffffffa058c42d>] cv_wait_common+0x10d/0x1c0 [spl]
[  599.987825]  [<ffffffff81085ca0>] ? wake_up_atomic_t+0x30/0x30
[  599.987835]  [<ffffffffa058c4f5>] __cv_wait+0x15/0x20 [spl]
[  599.987871]  [<ffffffffa0691e0b>] txg_quiesce_thread+0x28b/0x400 [zfs]
[  599.987904]  [<ffffffffa0691b80>] ? txg_sync_thread+0x5c0/0x5c0 [zfs]
[  599.987914]  [<ffffffffa0584bca>] thread_generic_wrapper+0x7a/0x90 [spl]
[  599.987924]  [<ffffffffa0584b50>] ? __thread_exit+0xa0/0xa0 [spl]
[  599.987931]  [<ffffffff81084e80>] kthread+0xc0/0xd0
[  599.987936]  [<ffffffff81084dc0>] ? kthread_create_on_node+0x120/0x120
[  599.987941]  [<ffffffff814fc27c>] ret_from_fork+0x7c/0xb0
[  599.987947]  [<ffffffff81084dc0>] ? kthread_create_on_node+0x120/0x120
[  599.987951] INFO: task mount.zfs:4309 blocked for more than 120 seconds.
[  599.987954]       Tainted: P           O 3.12.1-3-ARCH #1
[  599.987956] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  599.987958] mount.zfs       D 0000000000000002     0  4309      1 0x00000004
[  599.987962]  ffff8803041239f0 0000000000000086 00000000000144c0 ffff880304123fd8
[  599.987967]  ffff880304123fd8 00000000000144c0 ffff88030a180000 ffff880304123960
[  599.987972]  ffffffff810902ab 000000001fc144c0 ffff8800cd84ab70 0000000000000001
[  599.987976] Call Trace:
[  599.987983]  [<ffffffff810902ab>] ? ttwu_stat+0x9b/0x110
[  599.987988]  [<ffffffff81094fdf>] ? try_to_wake_up+0x1ff/0x2d0
[  599.987993]  [<ffffffff81095102>] ? default_wake_function+0x12/0x20
[  599.987997]  [<ffffffff814f2dc9>] schedule+0x29/0x70
[  599.988007]  [<ffffffffa058c42d>] cv_wait_common+0x10d/0x1c0 [spl]
[  599.988011]  [<ffffffff81085ca0>] ? wake_up_atomic_t+0x30/0x30
[  599.988021]  [<ffffffffa058c4f5>] __cv_wait+0x15/0x20 [spl]
[  599.988052]  [<ffffffffa069104b>] txg_wait_synced+0xcb/0x1b0 [zfs]
[  599.988077]  [<ffffffffa06d166e>] zil_replay_log_record+0xde/0x1a0 [zfs]
[  599.988100]  [<ffffffffa06d1d3e>] zil_parse+0x41e/0x7b0 [zfs]
[  599.988104]  [<ffffffff814f2849>] ? __schedule+0x3f9/0x950
[  599.988126]  [<ffffffffa06d0960>] ? zil_aitx_compare+0x20/0x20 [zfs]
[  599.988149]  [<ffffffffa06d1590>] ? zil_replay_error.isra.8+0xb0/0xb0 [zfs]
[  599.988175]  [<ffffffffa06d4768>] zil_replay+0xa8/0x110 [zfs]
[  599.988210]  [<ffffffffa06c2510>] zfs_sb_setup+0x140/0x150 [zfs]
[  599.988246]  [<ffffffffa06c2a22>] zfs_domount+0x252/0x2c0 [zfs]
[  599.988278]  [<ffffffffa06dec00>] ? zpl_mount+0x30/0x30 [zfs]
[  599.988310]  [<ffffffffa06dec0e>] zpl_fill_super+0xe/0x20 [zfs]
[  599.988319]  [<ffffffff811a765d>] mount_nodev+0x4d/0xb0
[  599.988351]  [<ffffffffa06debf5>] zpl_mount+0x25/0x30 [zfs]
[  599.988359]  [<ffffffff811a8279>] mount_fs+0x39/0x1b0
[  599.988369]  [<ffffffff811c23f7>] vfs_kern_mount+0x67/0x100
[  599.988377]  [<ffffffff811c4a9e>] do_mount+0x23e/0xa90
[  599.988390]  [<ffffffff8113979e>] ? __get_free_pages+0xe/0x50
[  599.988397]  [<ffffffff811c46e6>] ? copy_mount_options+0x36/0x170
[  599.988404]  [<ffffffff811c5373>] SyS_mount+0x83/0xc0
[  599.988411]  [<ffffffff814fc32d>] system_call_fastpath+0x1a/0x1f
[  719.910872] INFO: task txg_quiesce:4305 blocked for more than 120 seconds.
[  719.910880]       Tainted: P           O 3.12.1-3-ARCH #1
[  719.910882] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  719.910885] txg_quiesce     D 0000000000000002     0  4305      2 0x00000000
[  719.910892]  ffff88030a141d58 0000000000000046 00000000000144c0 ffff88030a141fd8
[  719.910898]  ffff88030a141fd8 00000000000144c0 ffff88030d97f0f0 ffff8803043585ce
[  719.910902]  ffff88030a141cf8 ffffffff8129a484 ffffffff810bffff ffff88030a141d40
[  719.910907] Call Trace:
[  719.910919]  [<ffffffff8129a484>] ? vsnprintf+0x214/0x680
[  719.910927]  [<ffffffff810bffff>] ? __do_adjtimex+0x17f/0x530
[  719.910933]  [<ffffffff81189357>] ? __kmalloc+0x247/0x2b0
[  719.910961]  [<ffffffffa05806fe>] ? kmem_alloc_debug+0x20e/0x500 [spl]
[  719.910969]  [<ffffffff814f2dc9>] schedule+0x29/0x70
[  719.910979]  [<ffffffffa058c42d>] cv_wait_common+0x10d/0x1c0 [spl]
[  719.910985]  [<ffffffff81085ca0>] ? wake_up_atomic_t+0x30/0x30
[  719.910995]  [<ffffffffa058c4f5>] __cv_wait+0x15/0x20 [spl]
[  719.911031]  [<ffffffffa0691e0b>] txg_quiesce_thread+0x28b/0x400 [zfs]
[  719.911064]  [<ffffffffa0691b80>] ? txg_sync_thread+0x5c0/0x5c0 [zfs]
[  719.911074]  [<ffffffffa0584bca>] thread_generic_wrapper+0x7a/0x90 [spl]
[  719.911084]  [<ffffffffa0584b50>] ? __thread_exit+0xa0/0xa0 [spl]
[  719.911091]  [<ffffffff81084e80>] kthread+0xc0/0xd0
[  719.911096]  [<ffffffff81084dc0>] ? kthread_create_on_node+0x120/0x120
[  719.911101]  [<ffffffff814fc27c>] ret_from_fork+0x7c/0xb0
[  719.911106]  [<ffffffff81084dc0>] ? kthread_create_on_node+0x120/0x120
[  719.911111] INFO: task mount.zfs:4309 blocked for more than 120 seconds.
[  719.911114]       Tainted: P           O 3.12.1-3-ARCH #1
[  719.911116] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  719.911118] mount.zfs       D 0000000000000002     0  4309      1 0x00000004
[  719.911122]  ffff8803041239f0 0000000000000086 00000000000144c0 ffff880304123fd8
[  719.911127]  ffff880304123fd8 00000000000144c0 ffff88030a180000 ffff880304123960
[  719.911132]  ffffffff810902ab 000000001fc144c0 ffff8800cd84ab70 0000000000000001
[  719.911136] Call Trace:
[  719.911143]  [<ffffffff810902ab>] ? ttwu_stat+0x9b/0x110
[  719.911148]  [<ffffffff81094fdf>] ? try_to_wake_up+0x1ff/0x2d0
[  719.911153]  [<ffffffff81095102>] ? default_wake_function+0x12/0x20
[  719.911157]  [<ffffffff814f2dc9>] schedule+0x29/0x70
[  719.911167]  [<ffffffffa058c42d>] cv_wait_common+0x10d/0x1c0 [spl]
[  719.911171]  [<ffffffff81085ca0>] ? wake_up_atomic_t+0x30/0x30
[  719.911181]  [<ffffffffa058c4f5>] __cv_wait+0x15/0x20 [spl]
[  719.911212]  [<ffffffffa069104b>] txg_wait_synced+0xcb/0x1b0 [zfs]
[  719.911237]  [<ffffffffa06d166e>] zil_replay_log_record+0xde/0x1a0 [zfs]
[  719.911260]  [<ffffffffa06d1d3e>] zil_parse+0x41e/0x7b0 [zfs]
[  719.911264]  [<ffffffff814f2849>] ? __schedule+0x3f9/0x950
[  719.911287]  [<ffffffffa06d0960>] ? zil_aitx_compare+0x20/0x20 [zfs]
[  719.911309]  [<ffffffffa06d1590>] ? zil_replay_error.isra.8+0xb0/0xb0 [zfs]
[  719.911334]  [<ffffffffa06d4768>] zil_replay+0xa8/0x110 [zfs]
[  719.911360]  [<ffffffffa06c2510>] zfs_sb_setup+0x140/0x150 [zfs]
[  719.911385]  [<ffffffffa06c2a22>] zfs_domount+0x252/0x2c0 [zfs]
[  719.911406]  [<ffffffffa06dec00>] ? zpl_mount+0x30/0x30 [zfs]
[  719.911426]  [<ffffffffa06dec0e>] zpl_fill_super+0xe/0x20 [zfs]
[  719.911432]  [<ffffffff811a765d>] mount_nodev+0x4d/0xb0
[  719.911453]  [<ffffffffa06debf5>] zpl_mount+0x25/0x30 [zfs]
[  719.911457]  [<ffffffff811a8279>] mount_fs+0x39/0x1b0
[  719.911465]  [<ffffffff811c23f7>] vfs_kern_mount+0x67/0x100
[  719.911470]  [<ffffffff811c4a9e>] do_mount+0x23e/0xa90
[  719.911478]  [<ffffffff8113979e>] ? __get_free_pages+0xe/0x50
[  719.911483]  [<ffffffff811c46e6>] ? copy_mount_options+0x36/0x170
[  719.911487]  [<ffffffff811c5373>] SyS_mount+0x83/0xc0
[  719.911492]  [<ffffffff814fc32d>] system_call_fastpath+0x1a/0x1f
[  921.285080] perf samples too long (2518 > 2500), lowering kernel.perf_event_max_sample_rate to 50100

Can anybody help me salvage this pool?

@sanmai-NL Have you tried just waiting for the mount to complete? According to the stack it's attempting to replay the ZIL and is waiting on I/O. Do you see I/O activity to the drives while the mount is hanging?

You can disable the ZIL replay at mount time by setting the zil_replay_disable=1 module option when you load your modules. This will cause you to lose the last few seconds of data written to your pool, but it should get everything back online. After everything has been successfully mounted once, you'll want to remove this option.
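
For example (a sketch only; the file name under /etc/modprobe.d is arbitrary, and the option should be removed once the pool has mounted cleanly):

# One-off: load the zfs module with ZIL replay disabled
modprobe zfs zil_replay_disable=1

# Or persist it so the option is applied the next time the module is loaded
echo "options zfs zil_replay_disable=1" > /etc/modprobe.d/zfs-no-zil-replay.conf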

I have waited a long time for the mount to complete, at one point for more than half an hour, as far as I can remember.

I did report the zpool iostat output above. Can you suggest a better or alternative way to find out whether there is such I/O activity?

Thanks for the advice; I can now mount the zpool. However, I hit another hang in the middle of a recursive chown of a directory on the filesystem. I'm now trying to move data off this pool.

@sanmai-NL Looks like I missed the zpool iostat output. Another way to watch the I/O is simply with the standard Linux iostat utility. If you see activity on any of the ZFS devices it's usually best just to let it sit and finish.

If you're trying to pull all of the data out of the pool, you may want to import the pool read-only.
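
Something along these lines (a sketch; export first only if the pool is currently imported):

zpool export WD_1                    # only needed if the pool is already imported
zpool import -o readonly=on WD_1     # read-only import, so nothing is written while copying data off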

Thanks for the additional advice @behlendorf. I salvaged my data and destroyed and recreated the zpool. @dweeezil, I hope improvements can be made so that deadlocks become less common in the future.

I can also confirm that turning ZIL replay off allowed me to mount some filesystems that otherwise hung in the kernel exactly as @sanmai-NL described. Is there any reason why this issue has been closed? This is still happening with the latest version of ZOL.

Here is all that it takes to reproduce this on my fileserver:

root@titan:~# cd /tank/Data/Downloads/
root@titan:/tank/Data/Downloads# ls -ltra
total 89
-rw-r--r--  1 johnw dialout     0 Sep 21  2012 .localized
drwxrwxrwt  3 johnw dialout     3 Sep 26  2012 .TemporaryItems
drwxrwxrwt  3 johnw dialout     3 Apr 29  2013 .Trashes
-rw-r--r--  1 johnw dialout 12292 Jun 15  2013 .DS_Store
-rwxr-xr-x  1 johnw dialout   141 Nov  6 04:12 frob
drwx------  2 johnw dialout    15 Nov  7 20:14 .fseventsd
drwxr-xr-x  5 johnw dialout     8 Nov  8 03:33 .
drwxr-xr-x 15 johnw dialout    18 Dec 16 11:46 ..
dr-xr-xr-x  1 root  root        0 Jan  6 15:11 .zfs
root@titan:/tank/Data/Downloads# chown btsync .
chown: changing ownership of `.': No such file or directory
root@titan:/tank/Data/Downloads# ls -la
total 89
drwxr-xr-x  5 btsync dialout     8 Nov  8 03:33 .
drwxr-xr-x 15 johnw  dialout    18 Dec 16 11:46 ..
-rw-r--r--  1 johnw  dialout 12292 Jun 15  2013 .DS_Store
-rwxr-xr-x  1 johnw  dialout   141 Nov  6 04:12 frob
drwx------  2 johnw  dialout    15 Nov  7 20:14 .fseventsd
-rw-r--r--  1 johnw  dialout     0 Sep 21  2012 .localized
drwxrwxrwt  3 johnw  dialout     3 Sep 26  2012 .TemporaryItems
drwxrwxrwt  3 johnw  dialout     3 Apr 29  2013 .Trashes
dr-xr-xr-x  1 root   root        0 Jan  6 15:11 .zfs
root@titan:/tank/Data/Downloads# chown btsync /tank/Data/Downloads
ls -la

The server is now out to lunch and must be force-restarted, with ZIL replay disabled, in order to mount the filesystem.

@jwiegley I gather that after rebooting with replay disabled, a subsequent chown still hangs? Could you please find the inode number of the Downloads directory with ls -di /tank/Data/Downloads (it should be 4) and post the output of zdb -ddddd tank/Data/Downloads <inum> (where <inum> is likely 4). Is xattr=sa set on this filesystem? I have a feeling this is some sort of SA corruption. Do you get any ZFS-related messages in your syslog/dmesg?
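
Concretely, the two diagnostics are (assuming the inode number really is 4):

ls -di /tank/Data/Downloads          # first field is the inode/object number
zdb -ddddd tank/Data/Downloads 4     # dumps that object's dnode, bonus/SA data and block pointers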

Ack, I should have kept that filesystem; I have since destroyed it in order to move forward with my use of the filesystem. I'm using xattr=on, if that helps. ls -di does show a 4.

I've also got a pool with xattr=on, and I managed to capture the output of zdb -ddddd Cargobay/TestFS 9 on a directory for which a chown is currently hung. System is a fresh Arch install with kernel 3.12.6-1-ARCH and the corresponding builds from http://demizerone.com. (version 0.6.2_3.12.6-1 built on 2013-12-26)

EDIT: The pool came from ZEVO, but I have not yet attempted an upgrade.
EDIT2: The TestFS filesystem does fail to mount at this point.

It's worth noting that oftentimes commands will return 'File not found' even for things that are plainly there. I don't think I've seen any mount issues attributable to this. Here's the paste:

Dataset Cargobay/TestFS [ZPL], ID 5120, cr_txg 2041990, 356K, 56 objects, rootbp DVA[0]=<0:16001bb000:1000> DVA[1]=<0:74000e5000:1000> [L0 DMU objset] fletcher4 lzjb LE contiguous unique double size=800L/200P birth=7887399L/7887399P fill=56 cksum=17531407fd:7b70c9644da:15e8c67a8bf87:2bef7d78cd97c3

Object  lvl   iblk   dblk  dsize  lsize   %full  type
     9    1    16K    512     8K    512  100.00  ZFS directory
                                    136   bonus  System attributes
dnode flags: USED_BYTES USERUSED_ACCOUNTED 
dnode maxblkid: 0
path    /.Trashes
uid     501
gid     20
atime   Mon Dec 31 21:10:20 2012
mtime   Sun Jan  5 17:46:18 2014
ctime   Sun Jan  5 17:46:18 2014
crtime  Mon Dec 31 21:10:20 2012
gen 2041991
mode    41777
size    3
parent  4
links   3
pflags  0
microzap: 512 bytes, 1 entries

    501 = 52 (type: Directory)
Indirect blocks:
           0 L0 0:3e890f4000:1000 200L/200P F=1 B=7872338/7872338

    segment [0000000000000000, 0000000000000200) size   512

Here's the output of /proc/spl/kstat/zfs/dmu_tx:

3 1 0x01 12 576 4545263761 4742202820528
name                            type data
dmu_tx_assigned                 4    71577
dmu_tx_delay                    4    0
dmu_tx_error                    4    0
dmu_tx_suspended                4    0
dmu_tx_group                    4    2
dmu_tx_how                      4    0
dmu_tx_memory_reserve           4    0
dmu_tx_memory_reclaim           4    0
dmu_tx_memory_inflight          4    0
dmu_tx_dirty_throttle           4    0
dmu_tx_write_limit              4    0
dmu_tx_quota                    4    0

This pool came from ZEVO (which I'm beginning to gather has some "interesting" quirks), but it's been far too long to recall how it was created.

Interestingly, the TestFS filesystem is actually empty (I forget why I created it ages ago), but I have another filesystem that also exhibits this problem; I was able to rsync its contents off to a new pool created under ZoL at some point. I might have enough disk available to shuffle all my pools around, but I'm not confident of that.

I'm happy to leave this system in this state if you want me to collect anything else.
Interestingly, the stacks in dmesg make it look like txg_quiesce is the culprit here.

@linuxfood Thanks for the information. I took a peek at the https://github.com/demizer/archzfs repository and as best I can tell, it's a stock 0.6.2 code base, so none of the myriad new things in the current master branch should come into play. If you could, please do hang on to this pool. I've got a few questions, too:

Do you have any idea what version of GCC was used to compile the package? Am I correct in saying that the hang happens when doing a chown on the ".Trashes" directory? If so, what is the exact command that causes the hang? What are the circumstances that led to this hanging behavior? Did you recently upgrade to the 0.6.2_3.12.6-1 version of the package from something earlier?

@dweeezil it was probably built with GCC 4.8.X (likely .2) because that's what I've got on my Arch machine, but I'll drop a line to the maintainer and see. No upgrades were done of the archzfs package; this was a fresh setup, though the pool was created with the last release of ZEVO.

The exact command was chown -R <username>:<groupname> .Trash
I was able to run chown <username>:<groupname> . for the parent, though I doubt it's something about the recursive changes.

I'm happy to hang onto the pool, and can probably make it available for manual inspection as well.

Just to note, my pool came from ZEVO as well. Today I wiped the pool entirely and recreated all the filesystems, so I'll be back if the hanging behavior continues with the all-ZOL solution.

Hi,

And my late problematic pool was often used with ZEVO as well, though I created it with ZOL as far as I remember.


Could someone with access to ZEVO please make a small pool (with ZEVO) and make it available for downloading? I'm thinking something like the following would suffice for the moment:

truncate -s 100m /tank.img
zpool create tank /tank.img
zfs create tank/fs
echo file1 > /tank/fs/file1
echo file2a > /tank/fs/file2a
ln -s this_is_a_symlink /tank/fs/symlink
chown 1234:4321 /tank/fs/file2a
chmod 505 /tank/fs/file2a
zpool export tank
gzip -9 /tank.img

I'd like to examine the resulting pool and see whether anything looks out-of-sorts.

@dweeezil I'll see what I can do about getting to this tonight.

I've reopened the issue until we can get to the root cause.

Sorry for the delay. I didn't manage to get access to a machine with ZEVO on it last night, but I got it tonight. Here's a link to the image: http://bcs.testing.linuxfood.net/static/tank.img.gz

I've also confirmed that it behaves the same as my pool with actual data in it, so reproducing the issue should be trivial.

@linuxfood Thanks. I should have this tracked down pretty quickly. Right away, I can see that ZEVO uses a rather "interesting" set of SAs.

The problem is that ZEVO is creating files that have neither an "external ACL" (ZNODE_ACL) nor a new-style DACL_ACES SA. The owner/group-changing logic within ZFS demands that there be one of these (depending on the ZPL version) in order to function properly.

A workaround, interestingly enough, is to first chmod the files with their existing permissions, which fixes the problem by creating the DACL_ACES SA and allows subsequent chown/chgrp operations to work properly.
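
A rough sketch of that no-op chmod (the path is a placeholder, GNU stat is assumed, and symlinks are skipped because Linux has no lchmod(2); see the caveat a few comments further down before running this over a whole pool):

# Re-apply each file's current mode: the mode does not change, but ZFS
# rewrites the permissions as a DACL_ACES SA, after which chown/chgrp work.
find /tank/Data ! -type l -exec sh -c '
  for f; do
    mode=$(stat -c %a "$f") || continue   # current octal mode (GNU stat)
    chmod "$mode" "$f"
  done
' sh {} +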

I don't know anything about ZEVO, but with this behavior, I'd guess they disabled some (all?) of the ACL-related code in ZFS and are relying strictly on the "simple" SAs such as ZPL_UID, ZPL_GID, etc.

@linuxfood I have worked up a little hack in dweeezil/zfs@cb1fae6 which does seem to work around the problem. Normal disclaimer applies (try at your own risk, etc.).

@dweeezil very interesting. Is there any reason to apply this change if I can just do a "noop chmod"? If the workaround is sufficient, then I'm happy to do that workaround and wait for the revised code. Regardless, as soon as I can validate that my data is safe, I'll be doing the great data shuffle and recreating the pools anyway so I can take advantage of stuff like lz4.

@linuxfood This is a bit of a tough call. ZEVO is clearly doing something that no other ZFS implementation would likely have done; however, it seems reasonable that an implementation ought to be able to function without those ACL-related SAs. After all, ZoL can do a chmod perfectly fine without the ACL-related SAs (at least it seems so). I'm going to study the code paths involved a bit more and, depending on the feedback I get, may issue a pull request for this.

The main concern I've got about hacking up a no-op chmod script to "upgrade" the SAs is that Linux doesn't have an lchmod(2) call but it does have an lchown(2) call. There would always be a chance of an lchown(2) call in the future hanging the system. For that reason, I'd likely either simply rsync the pool or use my hackish patch.
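
For the rsync route, something like the following would do (a sketch; paths are placeholders, and -A/-X only matter if ACLs or xattrs are actually in use):

# Copy a dataset's contents onto a filesystem in a freshly created ZoL pool,
# preserving hard links (-H), ACLs (-A) and extended attributes (-X)
rsync -aHAX --numeric-ids /oldpool/data/ /newpool/data/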

One other note I'll make is that the current problem does not actually corrupt the pool; however, it does create ZIL records which cannot be replayed when the filesystem is mounted after a reboot. That's why the only workaround is to disable replay.

@dweeezil Nice analysis. I wasn't aware that the ZEVO implementation dropped the ACL-related SAs entirely.

This could certainly lead to problems because even though those ACLs aren't exposed under Linux, internally the code makes heavy use of them. In fact, all of the normal Unix permission bits are stored as an NFSv4-style ACL and then converted into the usual mode bits. If those ACLs don't exist it could lead to some strange issues.

Originally I thought about disabling them on Linux, like ZEVO did, but in the end decided against it for compatibility reasons like this. But since pools like this now do seem to exist, we should probably add code to handle this case as cleanly as possible and explain why that code is there. See my comments on your proposed fix.