mjansson / rpmalloc

Public domain cross platform lock free thread caching 16-byte aligned memory allocator implemented in C

Memory usage seems large compared with alternatives and is unaffected by cache settings.

rjobling opened this issue

I'm trying to understand why using rpmalloc as a replacement for the standard Windows allocator causes my app to use so much more RAM, as measured by the commit size shown in Task Manager.

I have an app that, in my tests, has a commit size of just over 17.5 GB. It makes most of its allocations from a single thread, roughly 1 million before it settles down, and half of those are very small.

If I use _aligned_malloc and the related functions instead, usage is 300 MB lower than with rpmalloc.

I've tried turning off caching by setting ENABLE_THREAD_CACHE and ENABLE_GLOBAL_CACHE to 0. To my surprise, this made the app use significantly more RAM: 900 MB more with both caches off.

I've also tried reducing GLOBAL_CACHE_MULTIPLIER to 2 instead of 8. This has little impact.

Can you help me understand why the memory usage is larger and what, if anything, can or should be done to reduce it?
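For anyone reproducing these experiments, here is a minimal sketch of how compile-time options such as ENABLE_THREAD_CACHE, ENABLE_GLOBAL_CACHE and GLOBAL_CACHE_MULTIPLIER are typically overridden. It assumes each option is guarded by `#ifndef` in the allocator source, so a `-D` flag (or `/D` with MSVC) on the translation unit that compiles the allocator takes precedence. The default of 8 comes from the post above; the defaults of 1 and the `cache_config_t` / `effective_cache_config` names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed pattern: a default that the build system can override,
   e.g. cc -DENABLE_THREAD_CACHE=0 -DENABLE_GLOBAL_CACHE=0 -c allocator.c */
#ifndef ENABLE_THREAD_CACHE
#define ENABLE_THREAD_CACHE 1      /* assumed default */
#endif
#ifndef ENABLE_GLOBAL_CACHE
#define ENABLE_GLOBAL_CACHE 1      /* assumed default */
#endif
#ifndef GLOBAL_CACHE_MULTIPLIER
#define GLOBAL_CACHE_MULTIPLIER 8  /* default quoted in the post above */
#endif

/* Catch nonsensical overrides at compile time. */
static_assert(GLOBAL_CACHE_MULTIPLIER > 0, "multiplier must be positive");

/* Hypothetical record of the settings this translation unit was built with. */
typedef struct {
    int thread_cache;
    int global_cache;
    int global_cache_multiplier;
} cache_config_t;

/* Returns the values that were in effect when this file was compiled. */
cache_config_t effective_cache_config(void) {
    cache_config_t cfg = { ENABLE_THREAD_CACHE, ENABLE_GLOBAL_CACHE,
                           GLOBAL_CACHE_MULTIPLIER };
    return cfg;
}

/* Prints the effective settings so a test run can be labelled correctly. */
void print_cache_config(void) {
    cache_config_t cfg = effective_cache_config();
    printf("thread cache: %d, global cache: %d, multiplier: %d\n",
           cfg.thread_cache, cfg.global_cache, cfg.global_cache_multiplier);
}
```

Because the values are fixed at compile time, defining them only in the application code that includes the allocator header has no effect; they have to reach the compilation of the allocator source itself.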

The current version does not decommit memory pages in larger partially used spans, which is why it tends to use more physical committed memory than standard lib allocators.

You could try the mjansson/rewrite branch which is more aggressive in decommitting pages and see if that fits your use case.
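For readers unfamiliar with the term, the sketch below illustrates what decommitting the unused pages of a partially used span means at the OS level. It is not rpmalloc's implementation: it assumes Linux, where `madvise` with `MADV_DONTNEED` on a private anonymous mapping drops the physical pages while the address range stays mapped (the Windows counterpart would be `VirtualFree` with `MEM_DECOMMIT`). The 16-page span, the 4 "live" pages, and the `span_decommit` and `decommit_demo` names are all hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: give back the physical pages behind
   [offset, offset + length) of a span, keeping the address range mapped.
   Returns 0 on success, -1 on failure (as madvise does). */
static int span_decommit(void* span, size_t offset, size_t length) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    assert(offset % page == 0);  /* madvise needs a page-aligned start */
    return madvise((char*)span + offset, length, MADV_DONTNEED);
}

/* Maps a 16-page span, touches all of it, then decommits the 12 pages
   that (in this made-up scenario) no longer hold live blocks.
   Returns 0 if the live pages kept their data and the rest read as zero. */
int decommit_demo(void) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t span_size = 16 * page;
    size_t live_size = 4 * page;

    unsigned char* span = mmap(NULL, span_size, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (span == MAP_FAILED)
        return -1;

    /* Writing to every page makes the whole span count as committed. */
    memset(span, 0xAB, span_size);

    if (span_decommit(span, live_size, span_size - live_size) != 0) {
        munmap(span, span_size);
        return -2;
    }

    /* Live pages are untouched; decommitted pages come back zero-filled
       on next access under Linux semantics. */
    int ok = span[0] == 0xAB &&
             span[live_size - 1] == 0xAB &&
             span[live_size] == 0 &&
             span[span_size - 1] == 0;

    munmap(span, span_size);
    return ok ? 0 : -3;
}
```

An allocator that skips this step for partially used spans keeps every page it has ever touched counted against the process, which matches the higher commit size described in the opening post.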

@mjansson the source code in the rewrite branch is around half the size (in LOC) of the one in the develop branch. Is it an older version, or a newer more compact version?

Newer, and some parts have not been ported over yet (like first-class heaps).

@mjansson, would you recommend using the newer branch version right now?

The lower line count makes me really want to try, but I don't know how ready/stable it is.

I've been looking for a memory allocator for the past few days that is both fast and small. Unfortunately most performance-oriented allocators (mimalloc, snmalloc, jemalloc) tend to be rather too large.

Try the rewrite branch again, both with and without ENABLE_DECOMMIT, and see how it performs now.

The simple heuristics for ENABLE_DECOMMIT=1 should be reasonably good now. Give it a try.

Have you tried using the rewrite branch? Closing this for now; reopen if you are interested in digging further into this issue.