ikwzm / udmabuf

User space mappable dma buffer device driver for Linux.

Kernel Oops when synching

polygon opened this issue

I just started testing this project on an UltraScale+ processor running in ARM64 mode, using a reserved-memory region. When I try to change the synchronization state, I get a kernel oops. My device tree looks as follows:

  reserved-memory {
    #address-cells = <2>;
    #size-cells = <2>;
    ranges;

    dma_region: dmaregion@7bf00000 {
      compatible = "shared-dma-pool";
      reg = <0x0 0x7bf00000 0x0 0x4000000>;
      no-map;
    };
  };

  dma: dma@7bf00000 {
    compatible = "ikwzm,udmabuf-0.10.a";
    device-name = "dma_region";
    minor-number = <0>;
    memory-region = <&dma_region>;
    size = <0x4000000>;
  };
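
As a quick sanity check of the reserved-memory node above, the following sketch decodes a reg property into a base address and a size. With #address-cells = <2> and #size-cells = <2>, each value is split into two 32-bit cells, most significant cell first. decode_reg is a hypothetical helper written for this illustration; it is not part of udmabuf or of any device-tree tooling.

```python
def decode_reg(cells, address_cells=2, size_cells=2):
    """Combine 32-bit device-tree cells into a (base, size) pair.

    Hypothetical helper for illustration only. It handles a single
    (address, size) entry, which is all the node above contains.
    """
    if len(cells) != address_cells + size_cells:
        raise ValueError("unexpected number of cells in reg")

    def join(parts):
        # Cells are big-endian: the first cell holds the upper 32 bits.
        value = 0
        for cell in parts:
            if not 0 <= cell <= 0xFFFFFFFF:
                raise ValueError("cell does not fit in 32 bits")
            value = (value << 32) | cell
        return value

    base = join(cells[:address_cells])
    size = join(cells[address_cells:])
    return base, size


# reg = <0x0 0x7bf00000 0x0 0x4000000> from the node above
base, size = decode_reg([0x0, 0x7BF00000, 0x0, 0x4000000])
print(hex(base), size // (1024 * 1024), "MiB")
```

The decoded base and size should correspond to the phys address and buffer size that the driver prints when the module is loaded.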

When I load the module on the target system, everything looks fine:

# modprobe udmabuf
[  178.772612] udmabuf dma@7bf00000: assigned reserved memory node dmaregion@7bf00000
[  178.798115] udmabuf dma_region: major number   = 246
[  178.803007] udmabuf dma_region: minor number   = 0
[  178.807784] udmabuf dma_region: phys address   = 0x000000007bf00000
[  178.814080] udmabuf dma_region: buffer size    = 67108864
[  178.819416] udmabuf dma_region: dma coherent   = 0
[  178.824215] udmabuf dma@7bf00000: driver installed.

# cd /sys/class/udmabuf/dma_region/
# ls
debug_vma        power            sync_for_cpu     sync_owne
dev              size             sync_for_device  sync_size
device           subsystem        sync_mode        uevent
phys_addr        sync_direction   sync_offset

# cat phys_addr
0x000000007bf00000
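
The same attributes can be read from a program instead of the shell. The sketch below reads phys_addr and size for a named device. read_udmabuf_info is a hypothetical helper, and the assumption that size is printed as a plain integer is mine; only the attribute names and the hexadecimal phys_addr format appear in the listing above. The demonstration runs against a temporary directory that mimics the sysfs layout, so it works on a machine without the driver.

```python
import os
import tempfile


def read_udmabuf_info(name, sysfs_root="/sys/class/udmabuf"):
    """Return phys_addr and size of a udmabuf device as integers.

    Hypothetical helper. Assumes phys_addr is hexadecimal (as shown by
    `cat phys_addr`) and that size is a decimal or 0x-prefixed integer.
    """
    base = os.path.join(sysfs_root, name)
    with open(os.path.join(base, "phys_addr")) as f:
        phys_addr = int(f.read().strip(), 16)
    with open(os.path.join(base, "size")) as f:
        size = int(f.read().strip(), 0)
    return {"phys_addr": phys_addr, "size": size}


# Fake sysfs tree with the values reported in this issue.
with tempfile.TemporaryDirectory() as fake_root:
    dev_dir = os.path.join(fake_root, "dma_region")
    os.makedirs(dev_dir)
    with open(os.path.join(dev_dir, "phys_addr"), "w") as f:
        f.write("0x000000007bf00000\n")
    with open(os.path.join(dev_dir, "size"), "w") as f:
        f.write("67108864\n")
    info = read_udmabuf_info("dma_region", sysfs_root=fake_root)
    print(hex(info["phys_addr"]), info["size"])
```

On the target, calling read_udmabuf_info("dma_region") with the default sysfs_root would read the real attributes.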

However, when I write to sync_for_cpu or sync_for_device, I get an oops:

# echo 1 > sync_for_device
[  490.115828] Unable to handle kernel paging request at virtual address ffffffc07bf00000
[  490.123730] Mem abort info:
[  490.126490]   Exception class = DABT (current EL), IL = 32 bits
[  490.132392]   SET = 0, FnV = 0
[  490.135432]   EA = 0, S1PTW = 0
[  490.138551] Data abort info:
[  490.141417]   ISV = 0, ISS = 0x00000147
[  490.145235]   CM = 1, WnR = 1
[  490.148182] swapper pgtable: 4k pages, 39-bit VAs, pgd = ffffff8009c76000
[  490.154970] [ffffffc07bf00000] *pgd=000000007fff6003, *pud=000000007fff6003, *pmd=000000007fff5003, *pte=0000000000000000
[  490.165910] Internal error: Oops: 96000147 [#1] PREEMPT SMP
[  490.171444] Modules linked in: udmabuf(O) zdma_uio(O)
[  490.176482] CPU: 0 PID: 1451 Comm: sh Tainted: G           O    4.14.0-ultrazed #8
[  490.183946] Hardware name: xlnx,zynqmp (DT)
[  490.188108] task: ffffffc07865cd00 task.stack: ffffff800c210000
[  490.194019] PC is at __clean_dcache_area_poc+0x20/0x38
[  490.199135] LR is at __swiotlb_sync_single_for_device+0x50/0x60
[  490.205036] pc : [<ffffff800808e91c>] lr : [<ffffff800808d15c>] pstate: 80000145
[  490.212416] sp : ffffff800c213cf0
[  490.215709] x29: ffffff800c213cf0 x28: ffffffc07865cd00
[  490.221006] x27: ffffff8008441000 x26: 0000000000000040
[  490.226301] x25: 0000000000000124 x24: 0000000000000015
[  490.231596] x23: ffffff800c213eb8 x22: 0000000000000000
[  490.236891] x21: 0000000004000000 x20: ffffffc078090c10
[  490.242185] x19: 000000007bf00000 x18: 0000000000000844
[  490.247480] x17: 0000007fbb41a128 x16: ffffff80081450b8
[  490.252775] x15: 0000000000000008 x14: 0000007fbb36cdc0
[  490.258070] x13: 00000000004b5474 x12: 0101010101010101
[  490.263365] x11: 0000000000000000 x10: 0101010101010101
[  490.268660] x9 : ffffff800c213d58 x8 : ffffffc001573280
[  490.273954] x7 : 0000000004000000 x6 : 000000007bf00000
[  490.279249] x5 : ffffffc078090c10 x4 : 0000000000000001
[  490.284544] x3 : 000000000000003f x2 : 0000000000000040
[  490.289839] x1 : ffffffc07ff00000 x0 : ffffffc07bf00000
[  490.295135] Process sh (pid: 1451, stack limit = 0xffffff800c210000)
[  490.301472] Call trace:
[  490.303899] Exception stack(0xffffff800c213bb0 to 0xffffff800c213cf0)
[  490.310327] 3ba0:                                   ffffffc07bf00000 ffffffc07ff00000
[  490.318143] 3bc0: 0000000000000040 000000000000003f 0000000000000001 ffffffc078090c10
[  490.325955] 3be0: 000000007bf00000 0000000004000000 ffffffc001573280 ffffff800c213d58
[  490.333767] 3c00: 0101010101010101 0000000000000000 0101010101010101 00000000004b5474
[  490.341579] 3c20: 0000007fbb36cdc0 0000000000000008 ffffff80081450b8 0000007fbb41a128
[  490.349391] 3c40: 0000000000000844 000000007bf00000 ffffffc078090c10 0000000004000000
[  490.357203] 3c60: 0000000000000000 ffffff800c213eb8 0000000000000015 0000000000000124
[  490.365015] 3c80: 0000000000000040 ffffff8008441000 ffffffc07865cd00 ffffff800c213cf0
[  490.372827] 3ca0: ffffff800808d15c ffffff800c213cf0 ffffff800808e91c 0000000080000145
[  490.380639] 3cc0: ffffff800c213ce0 ffffff8008435668 0000008000000000 ffffff800823e998
[  490.388450] 3ce0: ffffff800c213cf0 ffffff800808e91c
[  490.393309] [<ffffff800808e91c>] __clean_dcache_area_poc+0x20/0x38
[  490.399480] [<ffffff8000448890>] udmabuf_set_sync_for_device+0xa0/0xe8 [udmabuf]
[  490.406854] [<ffffff8008294af8>] dev_attr_store+0x18/0x28
[  490.412232] [<ffffff800819cb78>] sysfs_kf_write+0x38/0x50
[  490.417612] [<ffffff800819bcc4>] kernfs_fop_write+0x11c/0x184
[  490.423343] [<ffffff8008144c4c>] __vfs_write+0x1c/0xf8
[  490.428462] [<ffffff8008144ee4>] vfs_write+0xac/0x160
[  490.433496] [<ffffff80081450fc>] SyS_write+0x44/0x88
[  490.438443] Exception stack(0xffffff800c213ec0 to 0xffffff800c214000)
[  490.444869] 3ec0: 0000000000000001 00000000004f9260 0000000000000002 0000007fbb4ad000
[  490.452684] 3ee0: 0000000000650031 0000000000000000 0080008080808080 7f7f7f7f7f7f7f7f
[  490.460496] 3f00: 0000000000000040 fffffffffffffff0 0101010101010101 0000000000000000
[  490.468308] 3f20: 0101010101010101 00000000004b5474 0000007fbb36cdc0 0000000000000008
[  490.476120] 3f40: 00000000004f35d8 0000007fbb41a128 0000000000000844 0000000000000001
[  490.483932] 3f60: 00000000004f9260 0000000000000002 00000000004f4000 00000000004f9260
[  490.491744] 3f80: 0000000000000020 00000000004f4000 0000000000000000 00000000004f5688
[  490.499556] 3fa0: 0000000000000000 0000007fdc3d2260 000000000040dcac 0000007fdc3d2260
[  490.507368] 3fc0: 0000007fbb41a150 0000000080000000 0000000000000001 0000000000000040
[  490.515180] 3fe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  490.522993] [<ffffff8008082c70>] el0_svc_naked+0x24/0x28
[  490.528285] Code: 9ac32042 8b010001 d1000443 8a230000 (d50b7a20)
[  490.534360] ---[ end trace 7646d5a8a18a8b56 ]---
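
One piece of arithmetic helps when reading this trace. The sketch below computes where a physical address would sit in the arm64 kernel linear map, assuming a linear-map base of 0xffffffc000000000 (which follows from the "39-bit VAs" line) and assuming that RAM starts at physical address 0. Both constants are assumptions of this sketch, not values confirmed in the thread.

```python
# Assumed base of the kernel linear map for 39-bit virtual addresses:
# 2**64 - 2**38.
PAGE_OFFSET_39BIT = (1 << 64) - (1 << 38)
# Assumed physical address at which RAM starts on this board.
PHYS_OFFSET = 0x0


def linear_map_address(phys):
    """Virtual address of `phys` in the linear map under the assumptions above."""
    return PAGE_OFFSET_39BIT + (phys - PHYS_OFFSET)


phys_addr = 0x7BF00000   # phys address reported by the driver
size = 0x4000000         # buffer size reported by the driver

start = linear_map_address(phys_addr)
end = start + size
print(hex(start))
print(hex(end))
```

Under these assumptions, start equals the faulting address (and x0) and end equals x1 in the register dump. That is consistent with the cache clean operating on the linear-map alias of the buffer, for which the dump shows *pte=0000000000000000. This is one reading of the trace rather than an established diagnosis.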

Am I doing something fundamentally wrong?

Thank you for the issue.
I reproduced this issue in two configurations: ARM64 + Linux 4.9.0 and ARM + Linux 4.14.34.

In both cases I was able to reproduce the kernel panic.

However, the two configurations behaved slightly differently.

On ARM64 + Linux 4.9.0, insmod udmabuf.ko fails to initialize the reserved-memory pool unless no-map is specified:

        reserved-memory {
                #address-cells = <2>;
                #size-cells = <2>;
                ranges;
                dma_region: dmaregion@7bf00000 {
                        compatible = "shared-dma-pool";
                        reg = <0x0 0x7bf00000 0x0 0x04000000>;
                };
        };
        dma:dma@7bf00000 {
                compatible = "ikwzm,udmabuf-0.10.a";
                device-name = "dma_region";
                minor-number = <0>;
                memory-region = <&dma_region>;
                size = <0x04000000>;
        };

root@debian-fpga:~# insmod udmabuf
[  215.085469] memremap attempted on ram 0x000000007bf00000 size: 0x4000000
[  215.092123] ------------[ cut here ]------------
[  215.096695] WARNING: CPU: 2 PID: 3530 at kernel/memremap.c:111 memremap+0x14c/0x158
[  215.104321] Modules linked in: udmabuf(O+) fclkcfg(O) uio_pdrv_genirq
[  215.110743]
[  215.112224] CPU: 2 PID: 3530 Comm: insmod Tainted: G           O    4.9.0-xlnx-v2017.3-zynqmp-fpga #2
[  215.121420] Hardware name: ZynqMP UltraZed-EG IO Carrier Card (DT)
[  215.127584] task: ffffffc066bb7000 task.stack: ffffffc068f54000
[  215.133488] PC is at memremap+0x14c/0x158
[  215.137480] LR is at memremap+0x14c/0x158
[  215.141472] pc : [<ffffff800812e484>] lr : [<ffffff800812e484>] pstate: 00000145
[  215.148849] sp : ffffffc068f57950
[  215.152148] x29: ffffffc068f57950 x28: ffffff8009564000
[  215.157442] x27: 0000000000000001 x26: ffffffc068f57a18
[  215.162737] x25: 000000007bf00000 x24: 000000007bf00000
[  215.168032] x23: ffffffc066abfb80 x22: ffffff8008d37ee8
[  215.173327] x21: 0000000004000000 x20: 0000000004000000
[  215.178621] x19: 0000000000000004 x18: 0000000000000010
[  215.183916] x17: 0000000000000000 x16: 00000000ffffffff
[  215.189211] x15: ffffff8088cf88d7 x14: 0000000000000006
[  215.194506] x13: ffffff8008cf88e5 x12: 0000000000000007
[  215.199801] x11: 0000000000000006 x10: 000000000000015a
[  215.205096] x9 : 000000000000004c x8 : 3030347830203a65
[  215.210391] x7 : 7a69732030303030 x6 : ffffff8008cf8923
[  215.215685] x5 : 0000000000000000 x4 : 0000000000000000
[  215.220980] x3 : 0000000000000000 x2 : ffffffc07ffa76b8
[  215.226275] x1 : 0000004077373000 x0 : 000000000000003c
[  215.231569]
[  215.233046] ---[ end trace 7d3980cdbaae651f ]---
[  215.237647] Call trace:
[  215.240080] Exception stack(0xffffffc068f57780 to 0xffffffc068f578b0)
[  215.246503] 7780: 0000000000000004 0000008000000000 ffffffc068f57950 ffffff800812e484
[  215.254315] 77a0: ffffff8008cfacf8 000000000000003c ffffffc068f577d0 ffffff80080d8804
[  215.262127] 77c0: ffffff8008cf8368 ffffff8008ab82d8 ffffffc068f57870 ffffff80080d8b18
[  215.269939] 77e0: 0000000000000004 0000000004000000 0000000004000000 ffffff8008d37ee8
[  215.277751] 7800: ffffffc066abfb80 000000007bf00000 000000007bf00000 ffffffc068f57a18
[  215.285563] 7820: 000000000000003c 0000004077373000 ffffffc07ffa76b8 0000000000000000
[  215.293375] 7840: 0000000000000000 0000000000000000 ffffff8008cf8923 7a69732030303030
[  215.301187] 7860: 3030347830203a65 000000000000004c 000000000000015a 0000000000000006
[  215.308999] 7880: 0000000000000007 ffffff8008cf88e5 0000000000000006 ffffff8088cf88d7
[  215.316810] 78a0: 00000000ffffffff 0000000000000000
[  215.321672] [<ffffff800812e484>] memremap+0x14c/0x158
[  215.326710] [<ffffff8008502e84>] dma_init_coherent_memory+0x5c/0x190
[  215.333043] [<ffffff800850345c>] rmem_dma_device_init+0x64/0x98
[  215.338948] [<ffffff80086ce7dc>] of_reserved_mem_device_init_by_idx+0x104/0x1a8
[  215.346250] [<ffffff8000959a70>] udmabuf_platform_driver_probe+0x248/0x898 [udmabuf]
[  215.353963] [<ffffff80084ee150>] platform_drv_probe+0x50/0xb8
[  215.359693] [<ffffff80084ec614>] driver_probe_device+0x1fc/0x2a8
[  215.365680] [<ffffff80084ec76c>] __driver_attach+0xac/0xb0
[  215.371150] [<ffffff80084ea630>] bus_for_each_dev+0x60/0xa0
[  215.376704] [<ffffff80084ebe00>] driver_attach+0x20/0x28
[  215.381999] [<ffffff80084eba00>] bus_add_driver+0x1d0/0x238
[  215.387554] [<ffffff80084ecf40>] driver_register+0x60/0xf8
[  215.393022] [<ffffff80084ee08c>] __platform_driver_register+0x44/0x50
[  215.399454] [<ffffff800095f19c>] udmabuf_module_init+0x19c/0x1000 [udmabuf]
[  215.406391] [<ffffff80080830b8>] do_one_initcall+0x38/0x128
[  215.411945] [<ffffff800812eb5c>] do_init_module+0x5c/0x1c8
[  215.417416] [<ffffff8008104e00>] load_module+0x1c68/0x2030
[  215.422883] [<ffffff8008105448>] SyS_finit_module+0xd8/0xe8
[  215.428437] [<ffffff8008082ef0>] el0_svc_naked+0x24/0x28
[  215.433756] Reserved memory: failed to init DMA memory pool at 0x000000007bf00000, size 64 MiB
[  215.469341] udmabuf dma_region: major number   = 243
[  215.474237] udmabuf dma_region: minor number   = 0
[  215.479017] udmabuf dma_region: phys address   = 0x000000006c000000
[  215.485256] udmabuf dma_region: buffer size    = 67108864
[  215.490641] udmabuf dma_region: dma coherent   = 0
[  215.495412] udmabuf dma@7bf00000: driver installed.

However, on ARM + Linux 4.14.34, insmod udmabuf.ko succeeded when reusable was specified instead of no-map:

        reserved-memory {
                #address-cells = <0x1>;
                #size-cells = <0x1>;
                ranges;
                image_buf@0 {
                        compatible = "shared-dma-pool";
                        reg = <0x20000000 0x01000000>;
                        reusable;
                        phandle = <0x27>;
                };
        };
        udmabuf@0 {
                compatible = "ikwzm,udmabuf-0.10.a";
                device-name = "udmabuf0";
                minor-number = <0x0>;
                size = <0x00F00000>;
                memory-region = <0x27>;
        };

root@debian-fpga:~# insmod udmabuf
[   26.969844] udmabuf udmabuf@0: assigned reserved memory node image_buf@0
[   27.020800] udmabuf udmabuf0: major number   = 245
[   27.025514] udmabuf udmabuf0: minor number   = 0
[   27.030114] udmabuf udmabuf0: phys address   = 0x20000000
[   27.035549] udmabuf udmabuf0: buffer size    = 15728640
[   27.040745] udmabuf udmabuf0: dma coherent   = 0
[   27.045303] udmabuf udmabuf@0: driver installed.

Writing to sync_for_device also completes normally:

root@debian-fpga:~# echo 1 > /sys/class/udmabuf/udmabuf0/sync_for_device
root@debian-fpga:~#
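
The echo command above can also be issued from a program. The following sketch writes 1 to one of the two sync attributes of a device. trigger_sync is a hypothetical helper written for this illustration; only the attribute names sync_for_cpu and sync_for_device and the written value 1 come from this thread. The demonstration uses a temporary directory in place of /sys so that it runs without the driver.

```python
import os
import tempfile

SYNC_ATTRIBUTES = ("sync_for_cpu", "sync_for_device")


def trigger_sync(name, attribute, sysfs_root="/sys/class/udmabuf"):
    """Write "1" to a sync attribute, like `echo 1 > <attribute>`.

    Hypothetical helper. Returns the path that was written.
    """
    if attribute not in SYNC_ATTRIBUTES:
        raise ValueError(f"unknown sync attribute: {attribute}")
    path = os.path.join(sysfs_root, name, attribute)
    with open(path, "w") as f:
        f.write("1\n")  # echo appends a newline as well
    return path


# Demonstration against a fake sysfs directory.
with tempfile.TemporaryDirectory() as fake_root:
    os.makedirs(os.path.join(fake_root, "udmabuf0"))
    written = trigger_sync("udmabuf0", "sync_for_device", sysfs_root=fake_root)
    with open(written) as f:
        echoed = f.read()
    print(repr(echoed))
```

On the target, trigger_sync("udmabuf0", "sync_for_device") with the default sysfs_root would perform the same write as the shell command, and it needs the same permissions.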

I do not fully understand the reserved-memory mechanism yet, and I am not confident that I can solve this issue.
Until this issue is resolved, I recommend not using reserved-memory on ARM64.

This is a follow-up report.

When a DMA device uses the Linux kernel's reserved-memory, you need to specify either no-map or reusable in the device tree.

With no-map, the DMA memory pool mechanism in drivers/base/dma-coherent.c is used.
With reusable, the CMA memory pool mechanism in drivers/base/dma-contiguous.c is used.
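
The rule above can be written down as a small lookup. pool_mechanism is a hypothetical helper for illustration. Rejecting a node that sets both properties or neither is an assumption of this sketch based on the "either ... or" wording above, not a statement about what the kernel does in every version.

```python
def pool_mechanism(properties):
    """Map reserved-memory node properties to the mechanism named above.

    Hypothetical helper; `properties` is any collection of property names.
    """
    no_map = "no-map" in properties
    reusable = "reusable" in properties
    if no_map and reusable:
        # Assumption of this sketch: the two are alternatives.
        raise ValueError("specify either no-map or reusable, not both")
    if no_map:
        return "DMA memory pool"
    if reusable:
        return "CMA memory pool"
    raise ValueError("neither no-map nor reusable is specified")


print(pool_mechanism({"compatible", "reg", "no-map"}))
print(pool_mechanism({"compatible", "reg", "reusable"}))
```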

The kernel panic reported in this issue occurs with no-map (that is, with the DMA memory pool mechanism). I do not know its cause yet.

If you use udmabuf with reserved-memory, specify reusable so that the CMA memory pool mechanism is used.

When using the CMA memory pool mechanism, set up the device tree as follows:

        reserved-memory {
                #address-cells = <2>;
                #size-cells = <2>;
                ranges;
                dma_region: dmaregion@7c000000 {
                        compatible = "shared-dma-pool";
                        reg = <0x0 0x7C000000 0x0 0x04000000>;
                        reusable;
                };
        };
        dma:dma@7c000000 {
                compatible = "ikwzm,udmabuf-0.10.a";
                device-name = "dma_region";
                minor-number = <0>;
                memory-region = <&dma_region>;
                size = <0x04000000>;
        };

When using the CMA memory pool mechanism, you must not only specify reusable but also make sure that the address and size of the region are properly aligned.
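
A small check like the following can catch alignment mistakes before booting. The thread does not state the required alignment; the 4 MiB default used here is an assumption (a value commonly seen with 4 KiB pages) and depends on the kernel configuration. check_cma_region is a hypothetical helper written for this illustration.

```python
# Assumed CMA alignment. The real value depends on the kernel configuration.
ASSUMED_CMA_ALIGNMENT = 4 * 1024 * 1024


def check_cma_region(base, size, alignment=ASSUMED_CMA_ALIGNMENT):
    """Return a list of alignment problems for a reserved-memory region.

    Hypothetical helper; an empty list means base and size are both
    multiples of `alignment`.
    """
    problems = []
    if base % alignment:
        problems.append(f"base {base:#x} is not a multiple of {alignment:#x}")
    if size % alignment:
        problems.append(f"size {size:#x} is not a multiple of {alignment:#x}")
    return problems


# Region from the device tree recommended above.
aligned = check_cma_region(0x7C000000, 0x04000000)
# Region from the original report, for comparison.
unaligned = check_cma_region(0x7BF00000, 0x04000000)
print(aligned)
print(unaligned)
```

Under the 4 MiB assumption, the region at 0x7C000000 passes, while the base address 0x7bf00000 from the original report does not.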