Huge Pages
josephDunne opened this issue · comments
What is required to get support for huge pages? Is it just a matter of passing the correct flags to mmap with some checks around alignment and size etc?
Is it just a matter of passing the correct flags to mmap with some checks around alignment and size etc?
Yes, and on Linux you don't even need a flag.
Linux will automatically merge pages to a huge page when MADV_HUGEPAGE is set, and "The kernel will also allocate huge pages directly when the region is naturally aligned to the huge page size".
Windows can be passed the MEM_LARGE_PAGES flag when calling VirtualAlloc.
Shouldn't be too hard to add a flag to MmapOptions, but I'm not sure if there's any way to test this.
Linux will automatically merge pages to a huge page when MADV_HUGEPAGE is set, and "The kernel will also allocate huge pages directly when the region is naturally aligned to the huge page size".
Is MADV_HUGEPAGE set by default? Does what you're describing require that transparent huge pages not be disabled (/sys/kernel/mm/transparent_hugepage/enabled)?
I think the natural way to request huge pages on linux through memmap would be by passing the MAP_HUGETLB flag.
Windows can be passed the MEM_LARGE_PAGES flag when calling VirtualAlloc.
Yeah, we should be able to do this once #31 is implemented.
Is MADV_HUGEPAGE set by default?
Looks like it is for me (x86-64 Arch Linux, Linux 4.10.12):
~> cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
Does what you're describing require that transparent huge pages not be disabled?
Yes, it's based on transparent hugepage support.
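As a small illustration of the check above: the kernel marks the active transparent huge page mode with square brackets, as in the `[always] madvise never` output. The sketch below extracts that mode. The function names `active_thp_mode` and `read_thp_mode` are hypothetical and are not part of memmap.

```rust
use std::fs;

/// Extracts the active mode from the contents of
/// /sys/kernel/mm/transparent_hugepage/enabled, e.g. "[always] madvise never".
/// The selected mode is the word wrapped in square brackets.
/// (Hypothetical helper, not part of the memmap crate.)
fn active_thp_mode(contents: &str) -> Option<&str> {
    contents
        .split_whitespace()
        .find_map(|word| word.strip_prefix('[')?.strip_suffix(']'))
}

/// Reads the sysfs file and returns the active mode, or None if the file
/// is missing (e.g. a kernel built without transparent huge page support)
/// or has an unexpected format.
fn read_thp_mode() -> Option<String> {
    let contents = fs::read_to_string("/sys/kernel/mm/transparent_hugepage/enabled").ok()?;
    active_thp_mode(&contents).map(str::to_owned)
}
```

A caller could treat `Some("never")` or `None` as "madvise-based huge pages will not help here" and skip the hint.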
I think the natural way to request huge pages on linux through memmap would be by passing the MAP_HUGETLB flag.
This requires preallocation of hugepages, something which I've never seen enabled by default.
~> cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
0
~> cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
0
Maybe we can try MAP_HUGETLB and fall back to madvise if it fails? (Of course, the size and alignment checks are always needed.)
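The size and alignment check mentioned here could look roughly like the following. This is only a sketch: `HUGE_PAGE_SIZE` assumes the 2 MiB huge page size seen in the sysfs paths above (other sizes exist), and `is_huge_page_eligible` is a hypothetical name, not an existing memmap function.

```rust
/// 2 MiB, matching the hugepages-2048kB entry shown above.
/// This is an assumption; the real size should be queried from the system.
const HUGE_PAGE_SIZE: usize = 2 * 1024 * 1024;

/// Returns true when a region starting at `addr` with length `len` is
/// non-empty and both values are multiples of `huge_page_size`.
/// (Hypothetical helper, not part of the memmap crate.)
fn is_huge_page_eligible(addr: usize, len: usize, huge_page_size: usize) -> bool {
    // The power-of-two test also rejects 0, so the `%` below cannot divide by zero.
    huge_page_size.is_power_of_two()
        && len != 0
        && addr % huge_page_size == 0
        && len % huge_page_size == 0
}
```

Whichever path is taken (MAP_HUGETLB or the madvise fallback), a check of this kind would run before the call so that the request is either adjusted or rejected early.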
We should at the very least provide a way to set MAP_HUGETLB when creating the memory map, since that pretty fundamentally has to be done with this library. To date we haven't exposed any sort of madvise wrapper, but I think it's probably territory this crate could expand into. I'd like to get the crate refactor under control before we start introducing new APIs, though. If we do go the route of wrapping more virtual memory APIs, mprotect would be a strong contender as well.
Is this something that can be added backward compatibly after 1.0?
Yes, I'm confident this can be done backwards compatibly.
This requires preallocation of hugepages, something which I've never seen enabled by default.
I have some experience with the Oracle database (one of the few applications using hugepages).
When huge pages are needed, the required number is always preallocated (via sysctl at boot time, or /proc/sys/vm/nr_hugepages), and transparent huge pages are always disabled (via the kernel command line).
I tried to implement this, see https://github.com/tatref/memmap-rs/tree/hugepages
However, munmap is not working. I guess there is some issue with the alignments / offsets, even though the values seem fine.
strace on the example I provided gives:
mmap(NULL, 10240, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = 0x7f1c58800000
munmap(0x7f1c58800000, 10240) = -1 EINVAL (Invalid argument)
OK, that was a stupid issue: the size of the huge page was set to 2 KiB instead of 2 MiB...
Now the example is functional, including drop.
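For context on the EINVAL above: for MAP_HUGETLB mappings, mmap rounds the length up to a multiple of the huge page size on its own, but munmap requires the address and length it receives to already be such multiples. A minimal sketch of the rounding follows, with a hypothetical helper name, and not necessarily how the linked branch does it:

```rust
/// Rounds `len` up to the next multiple of `huge_page_size`.
/// Returns None if `huge_page_size` is not a power of two or if the
/// rounded value would overflow usize.
/// (Hypothetical helper, not taken from the linked branch.)
fn round_up_to_huge_page(len: usize, huge_page_size: usize) -> Option<usize> {
    if !huge_page_size.is_power_of_two() {
        return None;
    }
    let mask = huge_page_size - 1;
    // Adding the mask and then clearing the low bits rounds up.
    len.checked_add(mask).map(|n| n & !mask)
}
```

With a 2 MiB page size, the 10240-byte request from the strace output rounds up to 2097152 bytes. With a mistaken 2 KiB page size, 10240 is already a multiple and would be passed through unchanged, which is consistent with the failing munmap call.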
Hey! What's the status of this?