ZFS 2.2.4 causes processes to hang on file access
severgun opened this issue · comments
Sergey Vergun commented
System information
Type | Version/Name |
---|---|
Distribution Name | CentOS |
Distribution Version | 7.9.2009 |
Kernel Version | 3.10.0-957.5.1.el7.x86_64 |
Architecture | x86_64 |
OpenZFS Version | 2.2.4-1 |
Describe the problem you're observing
After the system has been running for some time, I start to observe a lot of zombie processes: the backup client, smbd, etc.
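To tell real zombies (state `Z`) apart from processes blocked in the kernel (state `D`, uninterruptible sleep), the state field of `/proc/<pid>/stat` can be read directly. The following is only a rough sketch; the function names `parse_stat_line` and `processes_in_state` are made up for illustration and are not part of any ZFS tooling.

```python
import os

def parse_stat_line(line):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left].strip())
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state

def processes_in_state(wanted=("D", "Z"), proc_root="/proc"):
    """List (pid, comm, state) for every process whose state is in `wanted`."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                found.append(parse_stat_line(f.read()))
        except (OSError, ValueError):
            continue  # process exited or stat was unreadable
    return [p for p in found if p[2] in wanted]
```

Calling `processes_in_state(("D",))` would list only the tasks in uninterruptible sleep.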
In the dmesg log I found:
[2217957.766237] INFO: task bpbkar:8838 blocked for more than 120 seconds.
[2217957.766314] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[2217957.766378] bpbkar D ffff95ec37265140 0 8838 1 0x00000080
[2217957.766384] Call Trace:
[2217957.766643] [<ffffffffc0927914>] ? multilist_insert+0xa4/0xc0 [zfs]
[2217957.766652] [<ffffffffb0567c49>] schedule+0x29/0x70
[2217957.766671] [<ffffffffc06b5325>] cv_wait_common+0x125/0x150 [spl]
[2217957.766679] [<ffffffffafec2d40>] ? wake_up_atomic_t+0x30/0x30
[2217957.766685] [<ffffffffc06b5365>] __cv_wait+0x15/0x20 [spl]
[2217957.766784] [<ffffffffc09aeb13>] zfs_rangelock_enter_impl+0x143/0x5c0 [zfs]
[2217957.766857] [<ffffffffc09ae7bd>] ? zfs_rangelock_exit+0x1dd/0x2a0 [zfs]
[2217957.766928] [<ffffffffc09aefa1>] zfs_rangelock_enter+0x11/0x20 [zfs]
[2217957.767000] [<ffffffffc09b2d9a>] zfs_read+0xca/0x3e0 [zfs]
[2217957.767093] [<ffffffffc09f36f8>] zpl_aio_read+0x108/0x1c0 [zfs]
[2217957.767114] [<ffffffffb0040a83>] do_sync_read+0x93/0xe0
[2217957.767119] [<ffffffffb00414bf>] vfs_read+0x9f/0x170
[2217957.767122] [<ffffffffb004237f>] SyS_read+0x7f/0xf0
[2217957.767129] [<ffffffffb0574d21>] ? system_call_after_swapgs+0xae/0x146
[2217957.767132] [<ffffffffb0574ddb>] system_call_fastpath+0x22/0x27
[2217957.767135] [<ffffffffb0574d21>] ? system_call_after_swapgs+0xae/0x146
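For anyone collecting these reports across many hosts, the hung-task header lines can be pulled out of a dmesg dump with a small script. This is a minimal sketch that assumes the message format shown in the trace above; `hung_tasks` is a hypothetical helper name.

```python
import re

# Matches lines such as:
#   INFO: task bpbkar:8838 blocked for more than 120 seconds.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

def hung_tasks(dmesg_text):
    """Return a list of (comm, pid, seconds) for each hung-task report."""
    tasks = []
    for line in dmesg_text.splitlines():
        m = HUNG_TASK_RE.search(line)
        if m:
            tasks.append((m.group("comm"), int(m.group("pid")), int(m.group("secs"))))
    return tasks
```

The resulting PIDs can then be used to look up the open files of each blocked task.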
I can check which file causes the hang with `ls -l /proc/8838/fd`.
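The same lookup can be scripted. The sketch below only reads the symlinks under `/proc/<pid>/fd` and never opens the target files; `open_files` is a hypothetical helper name, and reading another user's process typically requires root.

```python
import os

def open_files(pid, proc_root="/proc"):
    """Map each open file descriptor number of `pid` to its link target."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    result = {}
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink named after the fd number.
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # fd was closed while we were scanning
    return result
```

For the task in the trace, `open_files(8838)` would return the same paths that `ls -l` prints.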
If I try to `ls` or `cat` the file, that process also hangs.
If I mount a snapshot, the file is readable there.
`zpool iostat` and `zpool iostat -w` always show exactly the same numbers.
Describe how to reproduce the problem
I don't know what causes it.
Just run CentOS 7.9 with kernel 3.10.0-957.5.1.el7.x86_64. ZFS was built from source as RPM packages.
Snapshots are created and purged hourly and daily.