ikwzm / udmabuf

User space mappable dma buffer device driver for Linux.

4GiB udmabuf fails

rockybulwinkle opened this issue

We have had success using u-dma-buf with a 512MiB address space. Now we're trying to use the full 4GiB of our device, and loading the kernel module fails. Below is our dmesg output. We are running u-dma-buf version 3.2.2.

test@zynq:~$ sudo insmod u-dma-buf2.ko
[sudo] password for test:
[   63.855991] u_dma_buf: loading out-of-tree module taints kernel.
[   63.863216] u-dma-buf udmabuf@400000000: assigned reserved memory node image_buf@400000000
[   64.946316] dma_alloc_coherent(size=4294967296) failed. return(0)
[   64.952415] u-dma-buf udmabuf@400000000: driver setup failed. return=-12
[   64.959420] u-dma-buf udmabuf@400000000: driver installed.
[   64.964920] u-dma-buf: probe of udmabuf@400000000 failed with error -12
test@zynq:~$ uname -a
Linux zynq.tezzaron.com 5.4.0-xilinx-v2020.1 #1 SMP Wed Aug 5 17:06:26 UTC 2020 aarch64 GNU/Linux

Here is my device tree fragment for petalinux:

/include/ "system-conf.dtsi"
/ {
    memory {
        device_type = "memory";
        /*
        Notes:
        first 512MB is normal PS DRAM
        next 4GB at HPM0 is for udmabuf
        next 1GB is PS DRAM
        Final 2GB is the rest of PS Ram
        */
        reg = <0x0 0x0 0x0 0x20000000 
                0x4 0x00000000 0x1 0x00000000
                0x0 0x40000000 0x0 0x40000000
                0x8 0x0 0x0 0x80000000
            >;
    };
    reserved-memory {
            #address-cells = <2>;
            #size-cells = <2>;
            ranges;
            image_buf0: image_buf@400000000 {
                    compatible = "shared-dma-pool";
                    reusable;
                    reg = <0x4 0x00000000 0x1 0x00000000>;
                    label = "image_buf0";
            };
   };


    udmabuf@400000000 {
            #size-cells = <2>;
            compatible = "ikwzm,u-dma-buf";
            device-name = "udmabuf0";
            size = <0x1 0x0>;
            dma-mask = <64>;
            memory-region = <&image_buf0>;
    };

};
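As a quick sanity check on the cell values in the fragment above: with two 32-bit cells, the first cell is the high word, so `<0x1 0x0>` is 0x1_0000_0000 bytes, which matches the `size=4294967296` in the dmesg output. A minimal sketch of that arithmetic (the helper name `dt_cells_to_u64` is made up for illustration and is not part of u-dma-buf):

```c
#include <stdint.h>

/* Combine two 32-bit device tree cells (high cell first) into one
 * 64-bit value. Hypothetical helper, for illustration only. */
uint64_t dt_cells_to_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```

So `dt_cells_to_u64(0x1, 0x0)` is 4294967296 (4GiB), and the region base `<0x4 0x00000000>` is 0x400000000.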

Do you see where this could be going wrong? Admittedly I jumped from 512MiB straight to 4GiB. I'm going to try a 1024MiB buffer size and report back.

Thanks again!

EDIT: corrected the previous sizes I tried

1GiB seems to be working fine. Here's the fragment for that. I think the most notable difference, besides the change to 1GiB instead of 512MiB, is that I removed the #size-cells = <2> line from the udmabuf node.

/include/ "system-conf.dtsi"
/ {
    memory {
        device_type = "memory";
        /*
        Notes:
        first 512MB is normal
        next 1GB at HPM0 is for u-dma-buf
        next 1GB is PS DRAM
        Final 2GB is the rest of PS Ram
        */
        reg = <0x0 0x0 0x0 0x20000000 
                0x4 0x00000000 0x0 0x40000000
                0x0 0x40000000 0x0 0x40000000
                0x8 0x0 0x0 0x80000000
            >;
    };
    reserved-memory {
            #address-cells = <2>;
            #size-cells = <2>;
            ranges;
            image_buf0: image_buf@400000000 {
                    compatible = "shared-dma-pool";
                    reusable;
                    reg = <0x4 0x00000000 0x0 0x40000000>;
                    label = "image_buf0";
            };
   };


    udmabuf@400000000 {
            compatible = "ikwzm,u-dma-buf";
            device-name = "udmabuf0";
            size = <0x40000000>;
            dma-mask = <64>;
            memory-region = <&image_buf0>;
    };

};

Thank you for the issue.

From the logs, it seems that dma_alloc_coherent() is trying to allocate a 4GiB buffer and is failing, after which the driver setup returns -12 (ENOMEM). dma_alloc_coherent() is part of the dma-mapping API inside the Linux kernel. Judging from the fact that it succeeds at 1GiB and fails at 4GiB, this might be a limitation of the Linux kernel.

Solving this problem is difficult, because I don't have hardware with enough memory to reserve 4GiB for reserved-memory.

Please let me know if you know anything else.

I'll let you know. In the meantime I'm going to see if I can work around the issue by creating multiple smaller regions. We're using this in conjunction with https://github.com/electrorys/smalloc to allocate memory for our programs. If I can create multiple regions, I'll just mmap them into adjacent virtual addresses for the smalloc pool.
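For anyone wanting to try the same workaround, here is a rough sketch of the adjacent-mmap idea, not the code we actually run. It reserves one contiguous range of virtual addresses with an anonymous PROT_NONE mapping, then maps each device over its slice of that range with MAP_FIXED. The function name `map_adjacent` and its parameters are made up for illustration, and it assumes every size is a multiple of the page size.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map n files or devices back to back into one contiguous virtual range.
 * Hypothetical helper: sizes[i] must be a multiple of the page size.
 * Returns the base address and stores the total length in *total_out,
 * or returns NULL on failure. */
void *map_adjacent(const char *const paths[], const size_t sizes[], int n,
                   size_t *total_out)
{
    size_t total = 0;
    for (int i = 0; i < n; i++)
        total += sizes[i];

    /* Reserve address space only; nothing is accessible yet. */
    unsigned char *base = mmap(NULL, total, PROT_NONE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    size_t off = 0;
    for (int i = 0; i < n; i++) {
        int fd = open(paths[i], O_RDWR);
        if (fd < 0)
            goto fail;
        /* MAP_FIXED replaces this slice of the reservation in place. */
        void *p = mmap(base + off, sizes[i], PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_FIXED, fd, 0);
        close(fd); /* the mapping stays valid after close */
        if (p == MAP_FAILED)
            goto fail;
        off += sizes[i];
    }

    *total_out = total;
    return base;

fail:
    munmap(base, total); /* drops the reservation and any slices mapped so far */
    return NULL;
}
```

With, say, four 1GiB regions exposed as /dev/udmabuf0 through /dev/udmabuf3 (only udmabuf0 appears in the fragments above; the other names are assumed), the returned base and total could be handed to the allocator as a single pool. The range is contiguous only in virtual address space. Each region is still a separate physical buffer, so a single DMA transfer must not cross a region boundary unless the reserved regions also happen to be physically adjacent.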

Thanks again!

This technique seems to be working well enough, thankfully.
