open-iscsi / tcmu-runner

A daemon that handles the userspace side of the LIO TCM-User backstore.

Kernel crashes on initiator login when ARM Ubuntu 18.04 is used as the iSCSI server

lnsyyj opened this issue

kernel version: 4.19.125
tcmu-runner version: 1.5.2

[Mon Jun  1 09:16:22 2020] scsi host5: iSCSI Initiator over TCP/IP
[Mon Jun  1 09:16:22 2020] scsi host6: iSCSI Initiator over TCP/IP
[Mon Jun  1 09:16:22 2020] scsi host7: iSCSI Initiator over TCP/IP
[Mon Jun  1 09:16:22 2020] Unable to handle kernel paging request at virtual address ffff7e00008fa940
[Mon Jun  1 09:16:22 2020] Mem abort info:
[Mon Jun  1 09:16:22 2020]   ESR = 0x96000004
[Mon Jun  1 09:16:22 2020]   Exception class = DABT (current EL), IL = 32 bits
[Mon Jun  1 09:16:22 2020]   SET = 0, FnV = 0
[Mon Jun  1 09:16:22 2020]   EA = 0, S1PTW = 0
[Mon Jun  1 09:16:22 2020] Data abort info:
[Mon Jun  1 09:16:22 2020]   ISV = 0, ISS = 0x00000004
[Mon Jun  1 09:16:22 2020]   CM = 0, WnR = 0
[Mon Jun  1 09:16:22 2020] swapper pgtable: 4k pages, 48-bit VAs, pgdp = 0000000082681c2b
[Mon Jun  1 09:16:22 2020] [ffff7e00008fa940] pgd=0000000000000000
[Mon Jun  1 09:16:22 2020] Internal error: Oops: 96000004 [#1] SMP
[Mon Jun  1 09:16:22 2020] Modules linked in: target_core_pscsi target_core_file target_core_iblock iscsi_target_mod cfg80211 target_core_user uio target_core_mod openvswitch nsh nf_nat_ipv6 nf_nat_ipv4 nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_ssif joydev input_leds ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure scsi_transport_sas hid_generic usbhid hid ast ttm drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops
[Mon Jun  1 09:16:22 2020]  cfbcopyarea fb bcache font crc32_ce crc64 drm megaraid_sas igb ixgbe ahci drm_panel_orientation_quirks libahci i2c_designware_platform i2c_designware_core i2c_algo_bit mdio i2c_core aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64
[Mon Jun  1 09:16:22 2020] Process iscsi_trx (pid: 7496, stack limit = 0x0000000010dd111a)
[Mon Jun  1 09:16:22 2020] CPU: 0 PID: 7496 Comm: iscsi_trx Not tainted 4.19.118-0419118-generic #202004230533
[Mon Jun  1 09:16:22 2020] Hardware name: Greatwall QingTian DF720/F601, BIOS 601FBE20 Sep 26 2019
[Mon Jun  1 09:16:22 2020] pstate: 80400005 (Nzcv daif +PAN -UAO)
[Mon Jun  1 09:16:22 2020] pc : flush_dcache_page+0x18/0x40
[Mon Jun  1 09:16:22 2020] lr : is_ring_space_avail+0x68/0x2f8 [target_core_user]
[Mon Jun  1 09:16:22 2020] sp : ffff000015123a80
[Mon Jun  1 09:16:22 2020] x29: ffff000015123a80 x28: 0000000000000000 
[Mon Jun  1 09:16:22 2020] x27: 0000000000001000 x26: ffff000023ea5000 
[Mon Jun  1 09:16:22 2020] x25: ffffcfa25bbe08b8 x24: 0000000000000078 
[Mon Jun  1 09:16:22 2020] x23: ffff7e0000000000 x22: ffff000023ea5001 
[Mon Jun  1 09:16:22 2020] x21: ffffcfa24b79c000 x20: 0000000000000fff 
[Mon Jun  1 09:16:22 2020] x19: ffff7e00008fa940 x18: 0000000000000000 
[Mon Jun  1 09:16:22 2020] x17: 0000000000000000 x16: ffff2d047e709138 
[Mon Jun  1 09:16:22 2020] x15: 0000000000000000 x14: 0000000000000000 
[Mon Jun  1 09:16:22 2020] x13: 0000000000000000 x12: ffff2d047fbd0a40 
[Mon Jun  1 09:16:22 2020] x11: 0000000000000000 x10: 0000000000000030 
[Mon Jun  1 09:16:22 2020] x9 : 0000000000000000 x8 : ffffc9a254820a00 
[Mon Jun  1 09:16:22 2020] x7 : 00000000000013b0 x6 : 000000000000003f 
[Mon Jun  1 09:16:22 2020] x5 : 0000000000000040 x4 : ffffcfa25bbe08e8 
[Mon Jun  1 09:16:22 2020] x3 : 0000000000001000 x2 : 0000000000000078 
[Mon Jun  1 09:16:22 2020] x1 : ffffcfa25bbe08b8 x0 : ffff2d040bc88a18 
[Mon Jun  1 09:16:22 2020] Call trace:
[Mon Jun  1 09:16:22 2020]  flush_dcache_page+0x18/0x40
[Mon Jun  1 09:16:22 2020]  is_ring_space_avail+0x68/0x2f8 [target_core_user]
[Mon Jun  1 09:16:22 2020]  queue_cmd_ring+0x1f8/0x680 [target_core_user]
[Mon Jun  1 09:16:22 2020]  tcmu_queue_cmd+0xe4/0x158 [target_core_user]
[Mon Jun  1 09:16:22 2020]  __target_execute_cmd+0x30/0xf0 [target_core_mod]
[Mon Jun  1 09:16:22 2020]  target_execute_cmd+0x294/0x390 [target_core_mod]
[Mon Jun  1 09:16:22 2020]  transport_generic_new_cmd+0x1e8/0x358 [target_core_mod]
[Mon Jun  1 09:16:22 2020]  transport_handle_cdb_direct+0x50/0xb0 [target_core_mod]
[Mon Jun  1 09:16:22 2020]  iscsit_execute_cmd+0x2b4/0x350 [iscsi_target_mod]
[Mon Jun  1 09:16:22 2020]  iscsit_sequence_cmd+0xd8/0x1d8 [iscsi_target_mod]
[Mon Jun  1 09:16:22 2020]  iscsit_process_scsi_cmd+0xac/0xf8 [iscsi_target_mod]
[Mon Jun  1 09:16:22 2020]  iscsit_get_rx_pdu+0x404/0xd00 [iscsi_target_mod]
[Mon Jun  1 09:16:22 2020]  iscsi_target_rx_thread+0xb8/0x130 [iscsi_target_mod]
[Mon Jun  1 09:16:22 2020]  kthread+0x130/0x138
[Mon Jun  1 09:16:22 2020]  ret_from_fork+0x10/0x18
[Mon Jun  1 09:16:22 2020] Code: f9000bf3 aa0003f3 aa1e03e0 d503201f (f9400260) 
[Mon Jun  1 09:16:22 2020] ---[ end trace 1e451c73f4266776 ]---
[Mon Jun  1 09:16:32 2020] ------------[ cut here ]------------
[Mon Jun  1 09:16:32 2020] WARNING: CPU: 0 PID: 7495 at kernel/kthread.c:391 __kthread_bind_mask+0xb0/0xb8
[Mon Jun  1 09:16:32 2020] Modules linked in: target_core_pscsi target_core_file target_core_iblock iscsi_target_mod cfg80211 target_core_user uio target_core_mod openvswitch nsh nf_nat_ipv6 nf_nat_ipv4 nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_ssif joydev input_leds ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure scsi_transport_sas hid_generic usbhid hid ast ttm drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops
[Mon Jun  1 09:16:32 2020]  cfbcopyarea fb bcache font crc32_ce crc64 drm megaraid_sas igb ixgbe ahci drm_panel_orientation_quirks libahci i2c_designware_platform i2c_designware_core i2c_algo_bit mdio i2c_core aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64
[Mon Jun  1 09:16:32 2020] CPU: 0 PID: 7495 Comm: iscsi_ttx Tainted: G      D           4.19.118-0419118-generic #202004230533
[Mon Jun  1 09:16:32 2020] Hardware name: Greatwall QingTian DF720/F601, BIOS 601FBE20 Sep 26 2019
[Mon Jun  1 09:16:32 2020] pstate: 40400005 (nZcv daif +PAN -UAO)
[Mon Jun  1 09:16:33 2020] pc : __kthread_bind_mask+0xb0/0xb8
[Mon Jun  1 09:16:33 2020] lr : __kthread_bind_mask+0xb0/0xb8
[Mon Jun  1 09:16:33 2020] sp : ffff00001511bca0
[Mon Jun  1 09:16:33 2020] x29: ffff00001511bca0 x28: 0000000000000000 
[Mon Jun  1 09:16:33 2020] x27: ffffcfa05e811838 x26: ffff2d047f4e0a18 
[Mon Jun  1 09:16:33 2020] x25: ffff2d040bf06750 x24: 0000000000000008 
[Mon Jun  1 09:16:33 2020] x23: ffff2d047fbc8708 x22: ffff2d040bef25d8 
[Mon Jun  1 09:16:33 2020] x21: ffff2d047f202368 x20: 0000000000000040 
[Mon Jun  1 09:16:33 2020] x19: ffffcfa2080ee3c0 x18: 0000000000000010 
[Mon Jun  1 09:16:33 2020] x17: 0000000000000000 x16: 0000000000000000 
[Mon Jun  1 09:16:33 2020] x15: ffffffffffffffff x14: ffff2d047fbc8708 
[Mon Jun  1 09:16:33 2020] x13: ffff2d04ffd878b7 x12: ffff2d047fd878bf 
[Mon Jun  1 09:16:33 2020] x11: ffff2d047fbed000 x10: ffff00001511b980 
[Mon Jun  1 09:16:33 2020] x9 : 00000000ffffffd0 x8 : ffff2d047ed8e3f8 
[Mon Jun  1 09:16:33 2020] x7 : 5d20657265682074 x6 : ffffc9a27ff33158 
[Mon Jun  1 09:16:33 2020] x5 : ffffc9a27ff33158 x4 : 0000000000000000 
[Mon Jun  1 09:16:33 2020] x3 : ffffc9a27ff3bf88 x2 : 5879f7f686901900 
[Mon Jun  1 09:16:33 2020] x1 : 0000000000000000 x0 : 0000000000000024 
[Mon Jun  1 09:16:33 2020] Call trace:
[Mon Jun  1 09:16:33 2020]  __kthread_bind_mask+0xb0/0xb8
[Mon Jun  1 09:16:33 2020]  kthread_unpark+0x8c/0x90
[Mon Jun  1 09:16:33 2020]  kthread_stop+0x60/0x198
[Mon Jun  1 09:16:33 2020]  iscsit_close_connection+0x428/0x998 [iscsi_target_mod]
[Mon Jun  1 09:16:33 2020]  iscsit_take_action_for_connection_exit+0xc0/0x190 [iscsi_target_mod]
[Mon Jun  1 09:16:33 2020]  iscsi_target_tx_thread+0x180/0x200 [iscsi_target_mod]
[Mon Jun  1 09:16:33 2020]  kthread+0x130/0x138
[Mon Jun  1 09:16:33 2020]  ret_from_fork+0x10/0x18
[Mon Jun  1 09:16:33 2020] ---[ end trace 1e451c73f4266777 ]---
[Mon Jun  1 09:16:37 2020]  connection2:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295230129, last ping 4295231424, now 4295232704
[Mon Jun  1 09:16:38 2020]  connection2:0: detected conn error (1022)
[Mon Jun  1 09:16:38 2020]  connection1:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295230126, last ping 4295231424, now 4295232708
[Mon Jun  1 09:16:38 2020]  connection1:0: detected conn error (1022)
[Mon Jun  1 09:16:38 2020]  connection3:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295230144, last ping 4295230144, now 4295232712
[Mon Jun  1 09:16:38 2020]  connection3:0: detected conn error (1022)
[Mon Jun  1 09:16:55 2020] iSCSI Login timeout on Network Portal 192.168.1.10:3260
[Mon Jun  1 09:17:44 2020] scsi 6:0:0:0: timing out command, waited 82s
[Mon Jun  1 09:17:44 2020] scsi 7:0:0:0: timing out command, waited 82s
[Mon Jun  1 09:17:44 2020] scsi 5:0:0:0: timing out command, waited 82s
[Mon Jun  1 09:18:38 2020]  session3: session recovery timed out after 120 secs
[Mon Jun  1 09:18:38 2020]  session1: session recovery timed out after 120 secs
[Mon Jun  1 09:18:38 2020]  session2: session recovery timed out after 120 secs

I think you had better take this to the kernel mailing list for help; it is a bug in the kernel/LIO.
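
For anyone reading the first oops above, the `ESR = 0x96000004` line can be decoded by hand. The following is a minimal Python sketch, assuming the standard ARMv8-A ESR_ELx layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0); `decode_esr` and the two lookup tables are illustrative names and cover only the values relevant here.

```python
# Decode the ESR value printed in the oops (ESR = 0x96000004).
# Assumed layout (ARMv8-A ESR_ELx): EC[31:26], IL[25], ISS[24:0].
# decode_esr is an illustrative helper, not a kernel or library function.

# Exception classes for data aborts only.
EC_NAMES = {
    0x24: "DABT (lower EL)",
    0x25: "DABT (current EL)",
}

# Data Fault Status Code (ISS[5:0]) values for translation faults only.
DFSC_NAMES = {
    0x04: "translation fault, level 0",
    0x05: "translation fault, level 1",
    0x06: "translation fault, level 2",
    0x07: "translation fault, level 3",
}

def decode_esr(esr):
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    iss = esr & 0x1FFFFFF
    dfsc = iss & 0x3F
    return {
        "ec": EC_NAMES.get(ec, hex(ec)),
        "il_bits": 32 if il else 16,
        "iss": iss,
        "wnr": (iss >> 6) & 0x1,  # 0 = read access, 1 = write access
        "dfsc": DFSC_NAMES.get(dfsc, hex(dfsc)),
    }

info = decode_esr(0x96000004)
print(info)
```

The decoded fields agree with what the kernel printed (`DABT (current EL)`, `IL = 32 bits`, `ISS = 0x00000004`, `WnR = 0`): a read hit a level 0 translation fault, which is consistent with the `pgd=0000000000000000` line for the faulting address.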

Thanks a lot for your response. Do you know which kernel version already has a fix for this issue? Or which version would you suggest we adopt, so that we can compile it ourselves?

BTW, can you hit this 100% of the time using 4.19.125? Have you tried the latest upstream kernel?

Yes, it happens every time I try to log in to the iSCSI target with kernel 4.19.125. I have also tried newer versions, including 5.4.44 and 5.6.16, and the result is always the same error. So I guess something beyond the kernel alone is wrong.

So this should be a bug that exists in the mainline kernel. I have never had a chance to test this on the ARM architecture, and on x86 it has worked well for me with upstream code since v4.20.

You can ping the maintainers on the kernel mailing list to get help; recently I have not had much time for this.

@lxbsz Thank you very much for your reply. O(∩_∩)O

I can confirm this on kernels 5.7 and 5.4.28 as well. Could you send a link to the mailing list thread? I would like to follow this issue.

@dmeyerholt
The problem should be in target_core_user
https://bugzilla.kernel.org/show_bug.cgi?id=208045
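
The call traces in this report point the same way: the frames directly above the faulting `flush_dcache_page` carry the `[target_core_user]` tag. A small Python sketch of how the module tags can be pulled out of a trace; the sample lines are copied from the first oops, while `parse_frames`, `FRAME_RE` and the `(core kernel)` label for untagged frames are illustrative choices.

```python
import re
from collections import Counter

# A few frames copied from the first call trace in this report.
TRACE = """\
[Mon Jun  1 09:16:22 2020]  flush_dcache_page+0x18/0x40
[Mon Jun  1 09:16:22 2020]  is_ring_space_avail+0x68/0x2f8 [target_core_user]
[Mon Jun  1 09:16:22 2020]  queue_cmd_ring+0x1f8/0x680 [target_core_user]
[Mon Jun  1 09:16:22 2020]  tcmu_queue_cmd+0xe4/0x158 [target_core_user]
[Mon Jun  1 09:16:22 2020]  __target_execute_cmd+0x30/0xf0 [target_core_mod]
"""

# Matches "symbol+0xoff/0xsize", optionally followed by "[module]".
FRAME_RE = re.compile(
    r"\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?\s*$"
)

def parse_frames(text):
    """Return (symbol, module) pairs; untagged frames are labeled '(core kernel)'."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append((m.group(1), m.group(2) or "(core kernel)"))
    return frames

frames = parse_frames(TRACE)
per_module = Counter(mod for _, mod in frames)
# The first module-tagged frame is the immediate caller of the faulting function.
first_module = next(mod for _, mod in frames if mod != "(core kernel)")
print(first_module, dict(per_module))
```

The same parsing applied to the later oops in this thread would show `tcmu_handle_completions` in `target_core_user` as the caller of `flush_dcache_page`.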

@BStroesser Thank you very much. I am preparing to try these patches and test them.

Hi @BStroesser, I tested these three patches. They work well and the bug has been fixed.
Thank you very much!

Can also confirm success on kernel 5.7.2

Thank you both for testing.

Since patch 1 already made it into 5.8-rc1, I will resend / send patches 2 and 3 to the linux-scsi and target-devel lists.

@lnsyyj and @dmeyerholt, may I add your "Tested-by" tags to the patches?

If so, please let me know your name and email address for the tags. If you want, send it directly to me using the email address retrievable from bugzilla.

@dmeyerholt
I applied the patches to 4.19.118 for testing and they worked well.
At present, we have no kernel 5.7.2 environment.

@BStroesser
No problem, my email is lnsyyj@hotmail.com, and you can call me JiangYu.
Do you want the address of this bug?
https://bugzilla.kernel.org/show_bug.cgi?id=208045

4.19.x seems important, being an LTS kernel. I could try those patches on 5.4.x (LTS as well), but I am not sure I can spend the time to do it. I know that 5.4.x was affected by this bug too, so chances are the patches from @BStroesser will work there as well.

[Fri Jun 19 11:19:46 2020] iSCSI Initiator Node: iqn.1993-08.org.debian:01:b27579df472 is not authorized to access iSCSI target portal group: 1.
[Fri Jun 19 11:19:47 2020] iSCSI Login negotiation failed.
[Fri Jun 19 11:21:33 2020] scsi host5: iSCSI Initiator over TCP/IP
[Fri Jun 19 11:23:32 2020] scsi host5: iSCSI Initiator over TCP/IP
[Fri Jun 19 11:23:32 2020] scsi 5:0:0:0: Direct-Access     LIO-ORG  TCMU device      0002 PQ: 0 ANSI: 5
[Fri Jun 19 11:23:32 2020] sd 5:0:0:0: Attached scsi generic sg11 type 0
[Fri Jun 19 11:23:32 2020] sd 5:0:0:0: [sdk] 2147483648 512-byte logical blocks: (1.10 TB/1.00 TiB)
[Fri Jun 19 11:23:32 2020] sd 5:0:0:0: [sdk] Write Protect is off
[Fri Jun 19 11:23:32 2020] sd 5:0:0:0: [sdk] Mode Sense: 2f 00 00 00
[Fri Jun 19 11:23:32 2020] sd 5:0:0:0: [sdk] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[Fri Jun 19 11:23:32 2020] sd 5:0:0:0: [sdk] Optimal transfer size 524288 bytes
[Fri Jun 19 11:23:32 2020] sd 5:0:0:0: [sdk] Attached SCSI disk
[Fri Jun 19 18:17:11 2020] EXT4-fs (sdk): mounted filesystem with ordered data mode. Opts: (null)
[Mon Jun 22 00:00:14 2020] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000000
[Mon Jun 22 00:00:14 2020] Mem abort info:
[Mon Jun 22 00:00:14 2020]   ESR = 0x96000004
[Mon Jun 22 00:00:14 2020]   Exception class = DABT (current EL), IL = 32 bits
[Mon Jun 22 00:00:14 2020]   SET = 0, FnV = 0
[Mon Jun 22 00:00:14 2020]   EA = 0, S1PTW = 0
[Mon Jun 22 00:00:14 2020] Data abort info:
[Mon Jun 22 00:00:14 2020]   ISV = 0, ISS = 0x00000004
[Mon Jun 22 00:00:14 2020]   CM = 0, WnR = 0
[Mon Jun 22 00:00:14 2020] user pgtable: 4k pages, 48-bit VAs, pgdp = 000000002c314fa8
[Mon Jun 22 00:00:14 2020] [0000000000000000] pgd=0000000000000000
[Mon Jun 22 00:00:14 2020] Internal error: Oops: 96000004 [#1] SMP
[Mon Jun 22 00:00:14 2020] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache target_core_pscsi target_core_file target_core_iblock iscsi_target_mod cfg80211 target_core_user uio target_core_mod openvswitch nsh nf_nat_ipv6 nf_nat_ipv4 nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nls_iso8859_1 ipmi_ssif joydev input_leds ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel sunrpc ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure scsi_transport_sas ast ttm drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea fb
[Mon Jun 22 00:00:15 2020]  font drm ixgbe igb drm_panel_orientation_quirks i2c_algo_bit mdio bcache crc32_ce crc64 hid_generic i2c_designware_platform ahci i2c_designware_core megaraid_sas libahci i2c_core usbhid hid aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64
[Mon Jun 22 00:00:15 2020] Process tp_librbd (pid: 5707, stack limit = 0x00000000d11d520b)
[Mon Jun 22 00:00:15 2020] CPU: 52 PID: 5707 Comm: tp_librbd Not tainted 4.19.118-maxcn #1
[Mon Jun 22 00:00:15 2020] Hardware name: Greatwall QingTian DF720/F601, BIOS 601FBE20 Sep 26 2019
[Mon Jun 22 00:00:15 2020] pstate: 20400005 (nzCv daif +PAN -UAO)
[Mon Jun 22 00:00:15 2020] pc : flush_dcache_page+0x18/0x40
[Mon Jun 22 00:00:15 2020] lr : tcmu_handle_completions+0xc4/0x4a0 [target_core_user]
[Mon Jun 22 00:00:15 2020] sp : ffff00002245bc10
[Mon Jun 22 00:00:15 2020] x29: ffff00002245bc10 x28: ffffcd70fb93d580 
[Mon Jun 22 00:00:15 2020] x27: ffff0000223c5000 x26: ffff363403dcf680 
[Mon Jun 22 00:00:15 2020] x25: ffffc771da0393e0 x24: 0000000000000000 
[Mon Jun 22 00:00:15 2020] x23: ffff000021bc5000 x22: ffffc771da038000 
[Mon Jun 22 00:00:15 2020] x21: ffff0000223c4fd0 x20: 00000000007fffd0 
[Mon Jun 22 00:00:15 2020] x19: 0000000000000000 x18: 0000000000000000 
[Mon Jun 22 00:00:15 2020] x17: 0000000000000000 x16: ffff36340eba9e00 
[Mon Jun 22 00:00:15 2020] x15: 0000000000000000 x14: 0000000000000000 
[Mon Jun 22 00:00:15 2020] x13: 0000000000000000 x12: 0000000000000000 
[Mon Jun 22 00:00:15 2020] x11: 0000000000000000 x10: 0000000000000000 
[Mon Jun 22 00:00:15 2020] x9 : 0000000000000000 x8 : 00000000000013e0 
[Mon Jun 22 00:00:15 2020] x7 : 0000000000000000 x6 : ffff00002245bcf8 
[Mon Jun 22 00:00:15 2020] x5 : ffff00002245bcf8 x4 : 0000000000000000 
[Mon Jun 22 00:00:15 2020] x3 : ffff36340fab0000 x2 : ffffb91200000000 
[Mon Jun 22 00:00:15 2020] x1 : 0000000000000000 x0 : ffff363403dc8f6c 
[Mon Jun 22 00:00:15 2020] Call trace:
[Mon Jun 22 00:00:15 2020]  flush_dcache_page+0x18/0x40
[Mon Jun 22 00:00:15 2020]  tcmu_handle_completions+0xc4/0x4a0 [target_core_user]
[Mon Jun 22 00:00:15 2020]  tcmu_irqcontrol+0x34/0x58 [target_core_user]
[Mon Jun 22 00:00:15 2020]  uio_write+0xb8/0x138 [uio]
[Mon Jun 22 00:00:15 2020]  __vfs_write+0x60/0x190
[Mon Jun 22 00:00:15 2020]  vfs_write+0xac/0x1b0
[Mon Jun 22 00:00:15 2020]  ksys_write+0x74/0xf0
[Mon Jun 22 00:00:15 2020]  __arm64_sys_write+0x24/0x30
[Mon Jun 22 00:00:15 2020]  el0_svc_common+0x88/0x180
[Mon Jun 22 00:00:15 2020]  el0_svc_handler+0x38/0x78
[Mon Jun 22 00:00:15 2020]  el0_svc+0x8/0xc
[Mon Jun 22 00:00:15 2020] Code: f9000bf3 aa0003f3 aa1e03e0 d503201f (f9400260) 
[Mon Jun 22 00:00:15 2020] ---[ end trace cdb72dbc3b2a8038 ]---
[Mon Jun 22 00:01:16 2020] ABORT_TASK: Found referenced iSCSI task_tag: 42
[Mon Jun 22 00:01:16 2020] ------------[ cut here ]------------
[Mon Jun 22 00:01:16 2020] WARNING: CPU: 7 PID: 746959 at kernel/workqueue.c:2919 __flush_work+0x260/0x290
[Mon Jun 22 00:01:16 2020] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache target_core_pscsi target_core_file target_core_iblock iscsi_target_mod cfg80211 target_core_user uio target_core_mod openvswitch nsh nf_nat_ipv6 nf_nat_ipv4 nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nls_iso8859_1 ipmi_ssif joydev input_leds ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel sunrpc ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ses enclosure scsi_transport_sas ast ttm drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea fb
[Mon Jun 22 00:01:16 2020]  font drm ixgbe igb drm_panel_orientation_quirks i2c_algo_bit mdio bcache crc32_ce crc64 hid_generic i2c_designware_platform ahci i2c_designware_core megaraid_sas libahci i2c_core usbhid hid aes_neon_bs aes_neon_blk crypto_simd cryptd aes_arm64
[Mon Jun 22 00:01:16 2020] CPU: 7 PID: 746959 Comm: kworker/u128:0 Tainted: G      D           4.19.118-maxcn #1
[Mon Jun 22 00:01:16 2020] Hardware name: Greatwall QingTian DF720/F601, BIOS 601FBE20 Sep 26 2019
[Mon Jun 22 00:01:16 2020] Workqueue: tmr-user target_tmr_work [target_core_mod]
[Mon Jun 22 00:01:16 2020] pstate: 40400005 (nZcv daif +PAN -UAO)
[Mon Jun 22 00:01:16 2020] pc : __flush_work+0x260/0x290
[Mon Jun 22 00:01:16 2020] lr : __flush_work+0x260/0x290
[Mon Jun 22 00:01:16 2020] sp : ffff0000167cbbd0
[Mon Jun 22 00:01:16 2020] x29: ffff0000167cbbd0 x28: 0000000000000000 
[Mon Jun 22 00:01:16 2020] x27: ffffccef4ebc68b8 x26: ffff0000167cbce0 
[Mon Jun 22 00:01:16 2020] x25: ffff3634100b6a40 x24: 0000000000000000 
[Mon Jun 22 00:01:16 2020] x23: ffff363410252000 x22: 0000000000000000 
[Mon Jun 22 00:01:16 2020] x21: ffff363410098000 x20: ffffcd6fad1b1550 
[Mon Jun 22 00:01:16 2020] x19: ffffcd6fad1b1550 x18: ffffffffffffffff 
[Mon Jun 22 00:01:16 2020] x17: 0000000000000000 x16: 0000000000000000 
[Mon Jun 22 00:01:16 2020] x15: ffff363410098708 x14: ffff3634902578af 
[Mon Jun 22 00:01:16 2020] x13: ffff3634102578be x12: ffff3634100bd000 
[Mon Jun 22 00:01:16 2020] x11: 0000000005f5e0ff x10: ffff363410099168 
[Mon Jun 22 00:01:16 2020] x9 : ffff36340fc7d018 x8 : ffff36340f24d010 
[Mon Jun 22 00:01:16 2020] x7 : 5d20657265682074 x6 : ffffc771fffe2158 
[Mon Jun 22 00:01:16 2020] x5 : ffffc771fffe2158 x4 : 0000000000000000 
[Mon Jun 22 00:01:16 2020] x3 : ffffc771fffeaf88 x2 : 203f2659feac9b00 
[Mon Jun 22 00:01:16 2020] x1 : 0000000000000000 x0 : 0000000000000024 
[Mon Jun 22 00:01:16 2020] Call trace:
[Mon Jun 22 00:01:16 2020]  __flush_work+0x260/0x290
[Mon Jun 22 00:01:16 2020]  __cancel_work_timer+0x134/0x1a8
[Mon Jun 22 00:01:16 2020]  cancel_work_sync+0x24/0x30
[Mon Jun 22 00:01:16 2020]  core_tmr_abort_task+0xfc/0x1b0 [target_core_mod]
[Mon Jun 22 00:01:16 2020]  target_tmr_work+0x108/0x1d8 [target_core_mod]
[Mon Jun 22 00:01:16 2020]  process_one_work+0x1f0/0x428
[Mon Jun 22 00:01:16 2020]  worker_thread+0x44/0x488
[Mon Jun 22 00:01:16 2020]  kthread+0x134/0x138
[Mon Jun 22 00:01:16 2020]  ret_from_fork+0x10/0x18
[Mon Jun 22 00:01:16 2020] ---[ end trace cdb72dbc3b2a8039 ]---
[Mon Jun 22 00:01:31 2020]  connection3:0: detected conn error (1021)
[Mon Jun 22 00:01:31 2020]  connection3:0: detected conn error (1021)
[Mon Jun 22 00:01:48 2020] iSCSI Login timeout on Network Portal 192.168.1.201:3260
[Mon Jun 22 00:03:31 2020]  session3: session recovery timed out after 120 secs
[Mon Jun 22 00:03:31 2020] sd 5:0:0:0: Device offlined - not ready after error recovery
[Mon Jun 22 00:03:31 2020] sd 5:0:0:0: [sdk] tag#57 FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED driverbyte=DRIVER_OK
[Mon Jun 22 00:03:31 2020] sd 5:0:0:0: [sdk] tag#57 CDB: Unmap/Read sub-channel 42 00 00 00 00 00 00 00 18 00
[Mon Jun 22 00:03:31 2020] print_req_error: I/O error, dev sdk, sector 1727496192
[Mon Jun 22 00:03:31 2020] sd 5:0:0:0: rejecting I/O to offline device
[Mon Jun 22 00:03:31 2020] print_req_error: I/O error, dev sdk, sector 1727430656
[Mon Jun 22 00:03:31 2020] sd 5:0:0:0: rejecting I/O to offline device
[Mon Jun 22 00:03:31 2020] print_req_error: I/O error, dev sdk, sector 1727463424
[Mon Jun 22 10:07:20 2020] sd 5:0:0:0: rejecting I/O to offline device
[Mon Jun 22 10:07:20 2020] print_req_error: I/O error, dev sdk, sector 1074326448
[Mon Jun 22 10:07:20 2020] Aborting journal on device sdk-8.
[Mon Jun 22 10:07:20 2020] sd 5:0:0:0: rejecting I/O to offline device
[Mon Jun 22 10:07:20 2020] print_req_error: I/O error, dev sdk, sector 1074003968
[Mon Jun 22 10:07:21 2020] Buffer I/O error on dev sdk, logical block 134250496, lost sync page write
[Mon Jun 22 10:07:21 2020] JBD2: Error -5 detected when updating journal superblock for sdk-8.
[Mon Jun 22 10:25:43 2020] sd 5:0:0:0: rejecting I/O to offline device
[Mon Jun 22 10:25:43 2020] print_req_error: I/O error, dev sdk, sector 0
[Mon Jun 22 10:25:43 2020] Buffer I/O error on dev sdk, logical block 0, lost sync page write
[Mon Jun 22 10:25:43 2020] EXT4-fs (sdk): I/O error while writing superblock
[Mon Jun 22 10:25:43 2020] EXT4-fs error (device sdk): ext4_journal_check_start:61: Detected aborted journal
[Mon Jun 22 10:25:43 2020] EXT4-fs (sdk): Remounting filesystem read-only
[Mon Jun 22 10:25:43 2020] sd 5:0:0:0: rejecting I/O to offline device
[Mon Jun 22 10:25:43 2020] print_req_error: I/O error, dev sdk, sector 0
[Mon Jun 22 10:25:43 2020] Buffer I/O error on dev sdk, logical block 0, lost sync page write
[Mon Jun 22 10:25:43 2020] EXT4-fs (sdk): I/O error while writing superblock

Hi @dmeyerholt @BStroesser, after a few days of testing I found another problem. When I log in to the target from the initiator, format the device with an ext4 file system, and write some data, the error reported above occurs.
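
Both oopses in this thread end with the same `Code:` line, where the word in parentheses is the faulting instruction. A minimal Python sketch that decodes it, assuming the A64 encoding of a 64-bit LDR (immediate, unsigned offset); `faulting_word` and `decode_ldr_x_uimm` are illustrative helpers and handle only this one instruction form.

```python
import re

def faulting_word(code_line):
    """Return the parenthesized (faulting) instruction word from a 'Code:' line."""
    m = re.search(r"\(([0-9a-f]{8})\)", code_line)
    return int(m.group(1), 16)

def decode_ldr_x_uimm(word):
    """Decode a 64-bit LDR (immediate, unsigned offset).

    Returns None for any other encoding. Assumed A64 layout:
    bits 31:22 = 0b1111100101, imm12 = bits 21:10, Rn = bits 9:5, Rt = bits 4:0.
    """
    if (word >> 22) != 0x3E5:
        return None
    imm12 = (word >> 10) & 0xFFF
    rn = (word >> 5) & 0x1F
    rt = word & 0x1F
    # The unsigned immediate is scaled by the access size (8 bytes).
    return {"rt": rt, "rn": rn, "offset": imm12 * 8}

code_line = "Code: f9000bf3 aa0003f3 aa1e03e0 d503201f (f9400260)"
insn = decode_ldr_x_uimm(faulting_word(code_line))
print("ldr x{rt}, [x{rn}, #{offset}]".format(**insn))
```

The faulting load reads through x19, and in both register dumps x19 equals the reported fault address: `ffff7e00008fa940` in the first oops and `0000000000000000` in this one. So `flush_dcache_page` appears to have been handed an invalid pointer in the first case and a NULL pointer in the second.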

I can confirm that too. I did not notice the initiator errors because of multipath.

I added a further patch for this issue to
https://bugzilla.kernel.org/show_bug.cgi?id=208045

Please test. I'm quite optimistic it will fix the problem (but as you know I have no ARM machine, so again it is compile-tested only).

Thank you @BStroesser. I applied patch 4 on kernel 4.19.118 and tested reads and writes for a day. No problems were encountered again.

I would like to be able to modify kernel code the way you do. Where can I find related books or materials?

Thank you for testing. I just sent the patch with your "Tested-by" to linux-scsi and target-devel.

Regarding books or materials: there are some general books and online resources on kernel and driver development. But IMHO the best source of information is the kernel source code ...

Thank you, it has now been running for 2 days without any issues.

@BStroesser Thanks, I will try that.
@dmeyerholt No problem found.