danburkert / memmap-rs

cross-platform Rust API for memory mapped IO

Huge Pages

josephDunne opened this issue

commented

What is required to get support for huge pages? Is it just a matter of passing the correct flags to mmap with some checks around alignment and size etc?

commented

Can this be done as part of #33?

Is it just a matter of passing the correct flags to mmap with some checks around alignment and size etc?

Yes, and on Linux you don't even need a flag.

Linux will automatically merge pages to a huge page when MADV_HUGEPAGE is set, and "The kernel will also allocate huge pages directly when the region is naturally aligned to the huge page size".

On Windows, the MEM_LARGE_PAGES flag can be passed when calling VirtualAlloc.

Shouldn't be too hard to add a flag to MmapOptions, but I'm not sure if there's any way to test this.
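For the madvise route, a minimal sketch could look something like the following (Linux-only; it assumes the libc crate for the raw call, uses MmapMut::map_anon for the anonymous mapping, and the function name is just illustrative):

use std::io;

use memmap::MmapMut;

// Sketch only: create an anonymous mapping and hint that it should be backed
// by transparent huge pages. Requires
// /sys/kernel/mm/transparent_hugepage/enabled to be `always` or `madvise`.
fn map_anon_with_thp(len: usize) -> io::Result<MmapMut> {
    let mut map = MmapMut::map_anon(len)?;
    let ret = unsafe {
        libc::madvise(
            map.as_mut_ptr() as *mut libc::c_void,
            map.len(),
            libc::MADV_HUGEPAGE,
        )
    };
    if ret != 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(map)
}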

Linux will automatically merge pages to a huge page when MADV_HUGEPAGE is set, and "The kernel will also allocate huge pages directly when the region is naturally aligned to the huge page size".

Is MADV_HUGEPAGE set by default? Does what you're describing require that transparent huge pages not be disabled (/sys/kernel/mm/transparent_hugepage/enabled)?

I think the natural way to request huge pages on linux through memmap would be by passing the MAP_HUGETLB flag.

On Windows, the MEM_LARGE_PAGES flag can be passed when calling VirtualAlloc.

Yeah, we should be able to do this once #31 is implemented.

Is MADV_HUGEPAGE set by default?

Looks like it is for me (x86-64 Arch Linux, Linux 4.10.12):

~> cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never

Does what you're describing require that transparent huge pages not be disabled?

Yes, it's based on transparent hugepage support.

I think the natural way to request huge pages on linux through memmap would be by passing the MAP_HUGETLB flag.

This requires preallocation of hugepages, something which I've never seen enabled by default.

~> cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
0
~> cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages 
0

Maybe we can try MAP_HUGETLB and fall back to madvise if it fails? (Of course, the size and alignment checks are always needed.)
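Roughly, such a fallback could look like the sketch below (Linux-only and untested; the 2 MiB constant, the enum, and the function name are all assumptions made for illustration, not existing crate API):

use std::{io, ptr};

use memmap::MmapMut;

// Assumed huge page size; the real value should be detected at runtime.
const HUGE_PAGE_SIZE: usize = 2 * 1024 * 1024;

enum HugeMapping {
    // Explicit hugetlb mapping: raw pointer plus the (rounded) mapped length,
    // which must also be passed to munmap later.
    HugeTlb(*mut u8, usize),
    // Regular mapping that has been advised to use transparent huge pages.
    Advised(MmapMut),
}

fn map_anon_huge(len: usize) -> io::Result<HugeMapping> {
    // MAP_HUGETLB requires the length to be a multiple of the huge page size.
    let rounded = (len + HUGE_PAGE_SIZE - 1) / HUGE_PAGE_SIZE * HUGE_PAGE_SIZE;
    let ptr = unsafe {
        libc::mmap(
            ptr::null_mut(),
            rounded,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_PRIVATE | libc::MAP_ANONYMOUS | libc::MAP_HUGETLB,
            -1,
            0,
        )
    };
    if ptr != libc::MAP_FAILED {
        return Ok(HugeMapping::HugeTlb(ptr as *mut u8, rounded));
    }
    // No preallocated huge pages (or some other failure): fall back to a
    // normal anonymous mapping with the MADV_HUGEPAGE hint.
    let mut map = MmapMut::map_anon(len)?;
    unsafe {
        libc::madvise(
            map.as_mut_ptr() as *mut libc::c_void,
            map.len(),
            libc::MADV_HUGEPAGE,
        );
    }
    Ok(HugeMapping::Advised(map))
}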

We should at the very least provide a way to set MAP_HUGETLB when creating the memory map, since that pretty fundamentally has to be done by this library. To date we haven't exposed any sort of madvise wrapper, but I think it's probably territory this crate could expand into. I'd like to get the crate refactor under control before we start introducing new APIs, though. If we do go the route of wrapping more virtual memory APIs, mprotect would be a strong contender as well.
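Purely as an illustrative sketch of the sort of mprotect wrapper meant here (the helper name is made up for this example and says nothing about the crate's actual API), it might look like:

use std::io;

use memmap::MmapMut;

// Hypothetical helper: make an existing writable mapping read-only in place.
fn protect_read_only(map: &mut MmapMut) -> io::Result<()> {
    let ret = unsafe {
        libc::mprotect(
            map.as_mut_ptr() as *mut libc::c_void,
            map.len(),
            libc::PROT_READ,
        )
    };
    if ret == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}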

Is this something that can be added backward compatibly after 1.0?

Yes, I'm confident this can be done backwards compatibly.

This requires preallocation of hugepages, something which I've never seen enabled by default.

I have some experience with the Oracle database (one of the few applications using hugepages).

If huge pages are needed, the required number of huge pages is always pre-allocated (via sysctl at boot time, or /proc/sys/vm/nr_hugepages), and transparent huge pages are always disabled (via the kernel cmdline).

I tried to implement this, see https://github.com/tatref/memmap-rs/tree/hugepages
However, munmap is not working. I guess there is some issue with the alignments / offsets, even though the values seem fine.

strace on the example I provided gives:

mmap(NULL, 10240, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = 0x7f1c58800000
munmap(0x7f1c58800000, 10240)           = -1 EINVAL (Invalid argument)

OK, that was a stupid issue: the size of the huge page was 2 KiB instead of 2 MiB...
Now the example is functional, including drop.
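For reference, the size handling that tripped this up: with MAP_HUGETLB the mapped length, and the length later passed to munmap, must be a multiple of the huge page size, so the requested length has to be rounded up with the right constant (sketch below assumes 2 MiB huge pages):

// 2 MiB, not 2 KiB: using the wrong constant here leaves the length
// unaligned, and munmap on the hugetlb mapping then fails with EINVAL.
const HUGE_PAGE_SIZE: usize = 2 * 1024 * 1024;

// Round a requested length up to a multiple of the huge page size.
fn round_to_huge_page(len: usize) -> usize {
    (len + HUGE_PAGE_SIZE - 1) / HUGE_PAGE_SIZE * HUGE_PAGE_SIZE
}

fn main() {
    // 10240 bytes is the length from the strace output above; once rounded
    // correctly the mapping (and its munmap) should cover a full 2 MiB.
    assert_eq!(round_to_huge_page(10240), 2 * 1024 * 1024);
}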

hey! What's the status of this?