ivmai / bdwgc

The Boehm-Demers-Weiser conservative C/C++ Garbage Collector (bdwgc, also known as bdw-gc, boehm-gc, libgc)

Home Page: https://www.hboehm.info/gc/

`un-mprotect failed` on Ubuntu with out-of-memory error

UltimatePea opened this issue

I am not sure if this is a good place to ask. Some of my compiled programs that run on Linux abort with an `un-mprotect failed` error. This issue does not occur on Windows or macOS. Is there any way to fix this?

Please give more info. Which configuration of libgc? Which libgc version? (Is it reproduced with master?)

Thanks for helping! I am using the default version bundled with Ubuntu 22.04 LTS, installed with `apt install libgc-dev` (configured by the default package management system).

The issue is hard to reproduce: it usually occurs if the program runs for > 20 minutes with continuous allocations. See, for example, this log: https://github.com/yuyan-lang/yuyan/actions/runs/5813820793/job/15762191733 (notice the `un-mprotect failed` message in the last lines of the log). If I shorten/simplify the program or even slightly change how a program is compiled, the issue usually disappears on subsequent runs.

It will be some time before I can check whether it is reproduced with the latest master. I will follow up here with more details when I get a chance to check.

This issue never occurs on macOS. So I am also wondering if there is a way to circumvent this issue.

Please give me the exact libgc version - it can be seen in /usr/include/gc/gc_version.h
Then I can check whether any related fix has been applied since that version.

My gc_version.h shows:

#define GC_TMP_VERSION_MAJOR 8
#define GC_TMP_VERSION_MINOR 0
#define GC_TMP_VERSION_MICRO 6 /* 8.0.6 */

> Please give me the exact libgc version - it can be seen in /usr/include/gc/gc_version.h Then I can check whether any related fix has been applied since that version.

I don't see any relevant fixes since v8.0.6, but ideally please try to reproduce it on master, or at least on https://github.com/ivmai/bdwgc/releases/tag/v8.0.10

> The issue is hard to reproduce: it usually occurs if the program runs for > 20 minutes with continuous allocations. See, for example, this log: https://github.com/yuyan-lang/yuyan/actions/runs/5813820793/job/15762191733 (notice the `un-mprotect failed` message in the last lines of the log). If I shorten/simplify the program or even slightly change how a program is compiled, the issue usually disappears on subsequent runs.

I also see a `GC Warning: Failed to expand heap by 60719476736 bytes` line (which means that something tries to allocate ~60 GB).

I don't know the root cause of the un-mprotect failure. Currently I can only say that it relates to the GC incremental collection mode.

I speculate that the root issue of this problem is that repeated allocations of large blocks of memory (for over 30 seconds with nearly no other computation) will overflow some system limit.

I haven't had a chance to try master yet, as the compiler is not set up to allow customization of library paths. The workaround is to break the program into smaller chunks so that the compiler does not generate code that allocates memory that often.

> I haven't had a chance to try master yet.

I think you could try the release-8_0 branch. It at least contains all the fixes applied to master.

> I speculate that the root issue of this problem is that repeated allocations of large blocks of memory

Maybe. BTW, is COUNT_UNMAPPED_REGIONS defined (in gcconfig.h) when compiling libgc?

> I think you could try the release-8_0 branch. It at least contains all the fixes applied to master.

Thank you! I am going to try master.

> Maybe. BTW, is COUNT_UNMAPPED_REGIONS defined (in gcconfig.h) when compiling libgc?

This string is currently not found in /usr/include/gc/gc_config_macros.h. I am going to try master with default configurations.

Compiling on latest master with static libraries and default options, I am getting

un-mprotect vdb failed at 0x7fad8b0b3000 (length 4096), errno= 12
un-mprotect vdb failed
fish: Job 1, './yy_bs --mode=worker --worker-…' terminated by signal SIGABRT (Abort)

Please let me know if you need any other information. This currently has become a blocking issue for me. I am happy to assist you by providing more information.

Disabling incremental collection seems to be a temporary fix.

My observation is that when incremental collection is enabled by calling GC_enable_incremental after GC_init, the program uses threads for parallel collection; without that call, the program does not use additional threads for GC. This incurs a performance penalty.

> My observation is that when incremental collection is enabled by calling GC_enable_incremental after GC_init, the program uses threads for parallel collection; without that call, the program does not use additional threads for GC. This incurs a performance penalty.

This is a different issue; could you please create a separate one for it?

> Compiling on latest master with static libraries and default options, I am getting
> un-mprotect vdb failed at 0x7fad8b0b3

I think I understand the root cause: the limit on the number of virtual memory regions per process (64K) in Linux. I need to think of the proper solution.

At the same time, in the case of master, I don't understand why the mprotect VDB is used rather than the SOFT_VDB implementation. I assume you are running Ubuntu on x86_64. I've created another issue about it (#599).

> This is a different issue; could you please create a separate one for it?

Sure! I will confirm the issue and then create it.

Update: Confirmed that this is not true. GC threads are running in parallel with the main program.

> I assume you are running Ubuntu on x86_64.

Yes.

un-mprotect failed

> I think I understand the root cause: the limit on the number of virtual memory regions per process (64K) in Linux. I need to think of the proper solution.

This is similar to #324 and the solution (fix) should be similar: maintain GC_num_dirty_regions and unprotect the whole region if GC_num_dirty_regions+GC_num_unmapped_regions reaches GC_UNMAPPED_REGIONS_SOFT_LIMIT.

> Compiling on latest master with static libraries and default options, I am getting
> un-mprotect vdb failed at 0x7fad8b0b3000 (length 4096), errno= 12

Please try on the latest master - the issue should be fixed.