plasma-umass / Mesh

A memory allocator that automatically reduces the memory footprint of C/C++ applications.

How does the page table translation work?

spin6lock opened this issue · comments

Hi, I'm very interested in Section 4.5.1 of the paper. Translation from virtual to physical memory addresses is done by the OS with the help of the TLB. How can Mesh replace it? I think the key is "We exploit the fact that mmap lets the same offset in a file (corresponding to a physical span) be mapped to multiple addresses", but I can't fill in the gap. Any help would be appreciated. :)

you're right that this is directly handled by the OS -- we do a couple of things in a row that let us direct the OS to update the page tables for us:

  1. we create a large, sparse temporary "file" that isn't backed by an on-disk file (just memory): https://github.com/plasma-umass/Mesh/blob/master/src/meshable_arena.cc#L513
  2. we then create a 1:1/identity mapping between a chunk of the address space and the file: https://github.com/plasma-umass/Mesh/blob/master/src/meshable_arena.cc#L54
  3. when we want to mesh a span, we update the mapping to point that page (or set of contiguous pages) at a different offset in the file (the offset of the span we are meshing into): https://github.com/plasma-umass/Mesh/blob/master/src/meshable_arena.cc#L458

Step 3 updates the page tables for meshing. Does that make sense?
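For readers who want to try the three steps above in isolation, here is a minimal sketch in C. It is not Mesh's code: the function names are hypothetical, the arena is only two pages, and an unlinked `mkstemp` file under `/tmp` stands in for the memory-only file from step 1 (that substitution is an assumption made for portability).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper for step 1: an unlinked temporary file of `len` bytes.
   This stands in for the memory-only file; it is not what Mesh does. */
static int open_backing_file(size_t len) {
  char path[] = "/tmp/mesh-sketch-XXXXXX";
  int fd = mkstemp(path);
  if (fd < 0) return -1;
  unlink(path); /* the file now lives only as long as fd is open */
  if (ftruncate(fd, (off_t)len) != 0) { close(fd); return -1; }
  return fd;
}

/* Returns 0 if remapping made two virtual pages alias one file offset. */
int remap_demo(void) {
  long ps = sysconf(_SC_PAGESIZE);
  assert(ps > 0);
  size_t page = (size_t)ps;

  int fd = open_backing_file(2 * page);
  if (fd < 0) return -1;

  /* Step 2: identity mapping, arena + k maps to file offset k. */
  char *arena = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (arena == MAP_FAILED) { close(fd); return -1; }
  char *keep_span = arena;
  char *remove_span = arena + page;
  strcpy(keep_span, "keep");
  strcpy(remove_span, "remove");

  /* Step 3: point remove_span's virtual page at keep_span's file offset.
     MAP_FIXED replaces the old mapping; the kernel rewrites the page tables. */
  void *p = mmap(remove_span, page, PROT_READ | PROT_WRITE,
                 MAP_SHARED | MAP_FIXED, fd, 0);
  int ok = (p == (void *)remove_span) && strcmp(remove_span, "keep") == 0;

  /* Both addresses now alias the same physical page. */
  keep_span[0] = 'K';
  ok = ok && remove_span[0] == 'K';

  munmap(arena, 2 * page);
  close(fd);
  return ok ? 0 : 1;
}
```

Calling `remap_demo()` from a `main` should return 0 on a POSIX system. Note that in this sketch the old contents of the remapped page are simply no longer reachable through that address, which is why objects have to be copied before the remap.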

Thanks a lot for your detailed explanation! https://github.com/plasma-umass/Mesh/blob/master/src/meshable_arena.cc#L458 calls mmap to set up a new mapping starting at virtual address remove, and then initializes it with the content starting at _fd + keepOff_offset. According to Figure 1 in the paper, the allocated objects in both remove and keep should survive at the end. But how can the objects in remove survive after this initialization?

@spin6lock that function is called after the "remove" span has been marked read-only (so no concurrent writes can happen while we are copying objects) and the objects have been copied from the "remove" span into the "keep" span, which is how they survive.
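The ordering described here (protect, copy, then remap) can be sketched as follows. This is an illustration under assumptions, not Mesh's implementation: `mesh_pair` and `mesh_pair_demo` are hypothetical names, the "remove" span holds exactly one live object, and the backing file is an unlinked temporary file.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical: mesh the span at remove_off into the span at keep_off.
   obj_off/obj_len describe the single live object in the "remove" span. */
int mesh_pair(int fd, char *arena, size_t span, size_t keep_off,
              size_t remove_off, size_t obj_off, size_t obj_len) {
  char *keep = arena + keep_off;
  char *rem = arena + remove_off;
  assert(obj_off + obj_len <= span);

  /* 1. Make the "remove" span read-only so it cannot change mid-copy. */
  if (mprotect(rem, span, PROT_READ) != 0) return -1;

  /* 2. Copy the live object to the same offset inside the "keep" span. */
  memcpy(keep + obj_off, rem + obj_off, obj_len);

  /* 3. Remap the "remove" addresses onto the "keep" span's file offset.
     The new mapping is created read-write again. */
  void *p = mmap(rem, span, PROT_READ | PROT_WRITE,
                 MAP_SHARED | MAP_FIXED, fd, (off_t)keep_off);
  return p == (void *)rem ? 0 : -1;
}

/* Returns 0 if both objects are still readable at their old addresses. */
int mesh_pair_demo(void) {
  size_t page = (size_t)sysconf(_SC_PAGESIZE);
  char path[] = "/tmp/mesh-sketch-XXXXXX";
  int fd = mkstemp(path);
  if (fd < 0) return -1;
  unlink(path);
  if (ftruncate(fd, (off_t)(2 * page)) != 0) { close(fd); return -1; }

  char *arena = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (arena == MAP_FAILED) { close(fd); return -1; }

  char *obj_a = arena;             /* lives in "keep" at offset 0 */
  char *obj_b = arena + page + 64; /* lives in "remove" at offset 64 */
  strcpy(obj_a, "from keep");
  strcpy(obj_b, "from remove");

  int rc = mesh_pair(fd, arena, page, 0, page, 64, sizeof "from remove");
  int ok = rc == 0
        && strcmp(obj_a, "from keep") == 0
        && strcmp(obj_b, "from remove") == 0       /* old pointer still valid */
        && strcmp(arena + 64, "from remove") == 0; /* same bytes via "keep" */

  munmap(arena, 2 * page);
  close(fd);
  return ok ? 0 : 1;
}
```

The object in the "remove" span survives because it was copied to the same offset in the "keep" span before the remap, so its old virtual address resolves to the copy. Releasing the file range that no longer has any mapping is omitted from this sketch.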

@bobby-stripe thanks for your reply! I've built a minimal working example following your instructions (https://github.com/spin6lock/mmap_experiment/blob/master/main.c#L112), but I can't find the corresponding copy code in Mesh. Do you use the sendfile API to copy the objects?

woah, cool! Super exciting to see other implementations :)

GlobalHeap.meshAllSizeClasses is what controls meshing, including finding pairs of pages to mesh. GlobalHeap.meshLocked is what handles performing the meshing after 2 meshable spans have been found, including marking spans as read-only, and it calls MiniHeap.consume which iterates over the set bits in the MiniHeap's bitmap (which correspond to live allocations), copying over the live allocations from src to this/dst.
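To make the bitmap-driven copy concrete, here is a small sketch of the idea in C. It is not the body of `MiniHeap.consume`: the names `meshable` and `consume_span`, the single 64-bit bitmap, and the fixed object size are all assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { SLOTS = 64, OBJ_SIZE = 16 };

/* Two spans can be meshed when no slot is live in both bitmaps. */
int meshable(uint64_t a_bits, uint64_t b_bits) {
  return (a_bits & b_bits) == 0;
}

/* Copy every live object from src into the same slot of dst and
   return dst's updated bitmap. A set bit means "slot is allocated". */
uint64_t consume_span(unsigned char *dst, uint64_t dst_bits,
                      const unsigned char *src, uint64_t src_bits) {
  assert(meshable(dst_bits, src_bits));
  for (unsigned slot = 0; slot < SLOTS; slot++) {
    uint64_t mask = UINT64_C(1) << slot;
    if (src_bits & mask) {
      memcpy(dst + slot * OBJ_SIZE, src + slot * OBJ_SIZE, OBJ_SIZE);
      dst_bits |= mask;
    }
  }
  return dst_bits;
}

/* Returns 0 if the merged span holds the live objects of both inputs. */
int consume_demo(void) {
  unsigned char dst[SLOTS * OBJ_SIZE] = {0};
  unsigned char src[SLOTS * OBJ_SIZE] = {0};
  uint64_t dst_bits = 0x5; /* slots 0 and 2 live */
  uint64_t src_bits = 0xA; /* slots 1 and 3 live */
  memset(dst + 0 * OBJ_SIZE, 'a', OBJ_SIZE);
  memset(dst + 2 * OBJ_SIZE, 'c', OBJ_SIZE);
  memset(src + 1 * OBJ_SIZE, 'b', OBJ_SIZE);
  memset(src + 3 * OBJ_SIZE, 'd', OBJ_SIZE);

  uint64_t merged = consume_span(dst, dst_bits, src, src_bits);
  int ok = merged == 0xF
        && dst[0 * OBJ_SIZE] == 'a' && dst[1 * OBJ_SIZE] == 'b'
        && dst[2 * OBJ_SIZE] == 'c' && dst[3 * OBJ_SIZE] == 'd';
  return ok ? 0 : 1;
}
```

Keeping each object in the same slot is what lets the old virtual addresses stay valid once the source span's pages are remapped onto the destination.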

Hope that helps!

Great! I get it now! This issue can be closed. Thanks again for your kind help, @bobby-stripe! 👍