PhilippeSigaud / Pegged

A Parsing Expression Grammar (PEG) module, using the D programming language.

v0.4.5-beta.1: dmd runs out of memory

veelo opened this issue

Using v0.4.4, dmd used to consume just over 1 GB of memory compiling my project. With v0.4.5-beta.1, I can see dmd memory consumption rise to over 1.8 GB, after which dmd fails with a "Memory allocation failed" exception. Same result for dmd 2.089.0 as for dmd 2.090.1.

My parser is generated as a module, not at compile time.

This problem was likely introduced by #269.

To see what the memory consumption is with a 64-bit compiler, I tested with ldc 1.20.0:

Using v0.4.4, compilation takes a maximum of 2 GB of memory. With v0.4.5-beta.1, memory consumption maxes out at 10.7 GB, and then compilation completes successfully.

That's a solid 5x increase in memory consumption during compilation, and compilation also takes a lot longer.

Note that this has nothing to do with parsing or parser generation: the parser module is already generated, and we are not yet running the parser. I guess something has changed inside the templates in peg.d that makes compilation use considerably more resources.

I had expected a little increase here and there, but this is rather a lot. I'll need to have another look at it.

To reproduce: dub build in pegged/examples/extended_pascal.
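
That is, roughly:

    # from a checkout of Pegged (here at v0.4.5-beta.1)
    cd pegged/examples/extended_pascal
    dub build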

The Extended Pascal grammar is special in that it contains left recursion. But that mainly causes complications in parsing, not compilation, I think.

Does this still occur if you use the command-line argument -lowmem? It was introduced in dmd 2.086.0.

The -lowmem flag helps ldc stay under 4.6 GB, which is a considerable improvement. But it does not help enough to make dmd stay under its 32-bit memory limit, sadly.

Tested on Windows by adding

      "dflags": ["-lowmem"],

to pegged/examples/extended_pascal/dub.json.
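
For context, the relevant part of the example's dub.json then looks roughly like this (the name and dependency shown here are assumptions; only the dflags line is the actual addition):

    {
        "name": "parse_ep",
        "dependencies": { "pegged": { "path": "../.." } },
        "dflags": ["-lowmem"]
    }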

dub test has also been failing with dmd running out of memory since #269. I feel the need to park this in a separate branch. @skoppe, or do you see a way to mitigate this?

Looking into it now.

The resulting binary is 10 times bigger, which suggests this is a template code bloat issue.

There has been talk about removing one level of templating in #190 (comment). I don't know whether that would be significant here.

Please also run the unit tests; I don't know whether #276 (comment) is related.

Yes, I can confirm the template bloat: 0.4.5-beta.1 has 93405 symbols, whereas 0.4.4 has only 23032.

I will figure out the largest offenders and get the count down; that should improve speed and memory consumption as well.
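
(For anyone who wants to reproduce the symbol counts: one way, assuming a POSIX toolchain and a static library build of pegged, is something like the following; D names can be piped through ddemangle from dlang/tools. The library file name is an assumption.)

    # count defined symbols in the built library
    nm --defined-only libpegged.a | wc -l

    # biggest symbols first, to spot the worst template offenders
    nm --defined-only --print-size --size-sort --reverse-sort libpegged.a | ddemangle | head -20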

I understand. What I want to know is whether your failures are also caused by #269. So if you could git checkout af79adb9f (which is the commit before that merge) and do the test again, then I'll know what to revert.

I can try to configure AppVeyor (a Windows CI service) to test on Windows.

That would be awesome as well, but the most important thing is to find the commit where the tests started failing.

I just did, and dub test finishes fine. It's something in merge pull request #269 that breaks these tests.

I remember running them and them succeeding. Regardless, I am looking into the memory consumption right now anyway, so I'll fix the tests afterwards as well.

I set up AppVeyor to run dub test on Windows:

%DC% --version
DMD32 D Compiler v2.091.0-dirty
Copyright (C) 1999-2020 by The D Language Foundation, All Rights Reserved written by Walter Bright
ci.bat
Unit Tests
""
Generating test runner configuration '__test__default__' for 'default' (library).
Performing "unittest-cov" build using dmd for x86.
pegged 0.4.5-beta.1+commit.9.gdfdb765: building configuration "__test__default__"...
Error: more than 32767 symbols in object file
dmd failed with exit code 1.

I have got the symbol count down to 23k, so that should no longer trigger. About to make a PR.

@veelo Could you remove these lines from .travis.yml?

branches:
  only:
  - master

I forgot to remove them, and without removing them, CI would only be launched by commits/merges on the master branch.

The unittest failure is a small one; about to fix it.

@veelo Could you remove these lines from .travis.yml?

Done in 8ccbcf2.

Thanks for your work on this! Unfortunately, it appears not to be enough for x86: https://ci.appveyor.com/project/PhilippeSigaud/pegged/build/job/nollc0299oow3crs

Reopening this for now.

It looks like that only happens with dmd 2.090. With 2.091 it works fine. Could it really be a bug in dmd 2.090?

List of all bug fixes and enhancements in D 2.091.0:
DMD Compiler regressions

    Bugzilla 10100: Identifiers with double underscores and allMembers
    Bugzilla 15812: static struct inside extern(C++) class cannot be used as key to associative array
    Bugzilla 16709: [Reg 2.068] Error: common.to at common.d conflicts with common.to at common.d
    Bugzilla 17098: Takes hours to -O compile then fails with Internal error: backend/cgreg.c 405
    Bugzilla 17145: [REG2.066.0] Tuple expansion error in local enum declaration

DMD Compiler bugs

    Bugzilla 9937: CTFE floats don't overflow correctly
    Bugzilla 11847: sub-pkg not available as qualified name
    Bugzilla 17257: Wrong recursive template destructor reflection
    Bugzilla 18147: Debug information limited in size
    Bugzilla 19479: Garbage .init in string mixins in static foreach in mixin templates
    Bugzilla 19504: pragma(mangle) ignored for C++ destructors
    Bugzilla 19515: POSIX,C++: Template argument pack wrongly mangled
    Bugzilla 20362: dmd fails to infer scope parameter for delegate
    Bugzilla 20375: std.typecons.RefCounted does not work with checkaction-context
    Bugzilla 20421: Exceptions don't work when linking through lld-link
    Bugzilla 20474: Deprecation warnings inside deprecated function template
    Bugzilla 20507: Debug statements affect inference of templated functions attributes
    Bugzilla 20514: obj-c info incorrectly placed in __objc_const section
    Bugzilla 20530: is(<...> == module/package) does not work with string mixins
    Bugzilla 20537: traits isPackage/isModule and is(package/module) fail on single level package.d import
    Bugzilla 20538: malformed enum definition compiles
    Bugzilla 20545: Segfault/Assertion failure when parsing invalid AA literal
    Bugzilla 20547: Wrong error message when trying to "new" an associative array
    Bugzilla 20551: In @safe code and using delegates, it's possible to escape references to function frame
    Bugzilla 20592: [GCC ASM] [ICE] dmd/iasmgcc.d(332): Assertion failure
    Bugzilla 20613: String switch in -betterC fails for 7+ labels

AppVeyor output:

Generating test runner configuration '__test__default__' for 'default' (library).
Performing "unittest-cov" build using dmd for x86.
pegged 0.4.5-beta.1+commit.17.gd3e2c3c: building configuration "__test__default__"...
---
ERROR: This is a compiler bug.
Please report it via https://issues.dlang.org/enter_bug.cgi
with, preferably, a reduced, reproducible example and the information below.
DustMite (https://github.com/CyberShadow/DustMite/wiki) can help with the reduction.
---
DMD v2.090.0-dirty
predefs   VibeCustomMain Have_pegged DigitalMars Windows CRuntime_DigitalMars CppRuntime_DigitalMars LittleEndian D_Version2 all D_InlineAsm D_InlineAsm_X86 X86 Win32 D_Coverage unittest assert D_ModuleInfo D_Exceptions D_TypeInfo D_HardFloat
binary    C:\dmd2\windows\bin\dmd.exe
version   v2.090.0-dirty
config    C:\dmd2\windows\bin\sc.ini
DFLAGS    -IC:\dmd2\windows\bin\..\..\src\phobos -IC:\dmd2\windows\bin\..\..\src\druntime\import
---
object.Error@(0): Access Violation
----------------
0x0065D21A
....

2.091 started shipping dmd as a 64-bit binary, which is capable of allocating more memory. On 32-bit Windows it obviously still runs 32-bit dmd, and on systems with little available memory this would also be a problem. It could be that we just crossed the boundary with a moderate increase, and if so, so be it (that's why I removed the "bug" label), but if possible it would be very nice not to grow the memory footprint too much.
I'll try to do some measurements later today.

I tried something... I went back to v0.4.4 and tried to build extended_pascal, with this result:

$ dub build
Running pre-generate commands for parse_ep...
Performing "debug" build using /usr/bin/dmd for x86_64.
pegged ~master: building configuration "default"...
parse_ep ~master: building configuration "default"...
Linking...
/usr/bin/ld: .dub/build/default-debug-linux.posix-x86_64-dmd_2091-386D33129E3812E779D86A1341ED2F19/parse_ep.o:(.data.rel.ro+0x30): undefined reference to `_D8epparser12__ModuleInfoZ'
/usr/bin/ld: .dub/build/default-debug-linux.posix-x86_64-dmd_2091-386D33129E3812E779D86A1341ED2F19/parse_ep.o: in function `_Dmain':
/home/luis/repos/Pegged/pegged/examples/extended_pascal/source/app.d:46: undefined reference to `_D8epparser__T9GenericEPTS6pegged3peg9ParseTreeZQBi2EP6opCallFAyaZQBp'
/usr/bin/ld: .dub/build/default-debug-linux.posix-x86_64-dmd_2091-386D33129E3812E779D86A1341ED2F19/parse_ep.o: in function `_D3std8typecons__T5TupleTAyaTmZQn11__xopEqualsFKxSQBwQBv__TQBpTQBmTmZQBzKxQzZb':
source/app.d:(.text._D3std8typecons__T5TupleTAyaTmZQn11__xopEqualsFKxSQBwQBv__TQBpTQBmTmZQBzKxQzZb[_D3std8typecons__T5TupleTAyaTmZQn11__xopEqualsFKxSQBwQBv__TQBpTQBmTmZQBzKxQzZb]+0x2c): undefined reference to `_D3std8typecons__T5TupleTAyaTmZQn__T8opEqualsTxSQBuQBt__TQBnTQBkTmZQBxZQBjMxFNaNbNiNfxQBnZb'
/usr/bin/ld: .dub/build/default-debug-linux.posix-x86_64-dmd_2091-386D33129E3812E779D86A1341ED2F19/parse_ep.o: in function `_D3std8typecons__T5TupleTAyaTmZQn8__xopCmpFKxSQBsQBr__TQBlTQBiTmZQBvKxQzZi':
source/app.d:(.text._D3std8typecons__T5TupleTAyaTmZQn8__xopCmpFKxSQBsQBr__TQBlTQBiTmZQBvKxQzZi[_D3std8typecons__T5TupleTAyaTmZQn8__xopCmpFKxSQBsQBr__TQBlTQBiTmZQBvKxQzZi]+0x2c): undefined reference to `_D3std8typecons__T5TupleTAyaTmZQn__T5opCmpTxSQBrQBq__TQBkTQBhTmZQBuZQBgMxFNaNbNiNfxQBnZi'
collect2: error: ld returned 1 exit status
Error: linker exited with status 1
/usr/bin/dmd failed with exit code 1.

WTF! I tried a second time (after removing the parse_ep and source/eppparse.d files) and then it doesn't fail!

 ~  r  P  p  e  extended_pascal  ➦ dc2a85b  dub build
Running pre-generate commands for parse_ep...
Performing "debug" build using /usr/bin/dmd for x86_64.
pegged ~master: target for configuration "default" is up to date.
parse_ep ~master: building configuration "default"...
Linking...
[same undefined-reference linker errors and linker failure as in the first run]
~/repos/Pegged/pegged/examples/extended_pascal (dc2a85b) $ dub build
Running pre-generate commands for parse_ep...
Performing "debug" build using /usr/bin/dmd for x86_64.
pegged ~master: building configuration "default"...
parse_ep ~master: building configuration "default"...
Linking...

OK... so I did runs with v0.4.4, v0.4.5-beta.1 and commit 21994c1:

v0.4.4

First run :
Maximum resident set size (kbytes): 623008 -> ~608MiB
Second run:
Maximum resident set size (kbytes): 688736 -> ~672MiB

v0.4.5-beta.1

First run :
Maximum resident set size (kbytes): 3904464 -> ~3800MiB
Second run:
Maximum resident set size (kbytes): 4576480 -> ~4400MiB

Commit 21994c1 (just before merging #269)

First run :
Maximum resident set size (kbytes): 623032 -> ~608MiB
Second run:
Maximum resident set size (kbytes): 688664 -> ~672MiB
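
(These "Maximum resident set size" figures are what GNU time reports; the measurement is roughly the following, run in pegged/examples/extended_pascal:)

    /usr/bin/time -v dub build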

So, yes... something went very wrong with the merge of #269.

Please try after #280 as well, as I fixed the memory issue there.

Yep! It's fixed:

First run:
Maximum resident set size (kbytes): 656104
Second run:
Maximum resident set size (kbytes): 725612

So, now to discover why a test crashes with dmd on Windows.

Great!

WTF! I tried a second time (after removing the parse_ep and source/eppparse.d files) and then it doesn't fail!

This is most likely due to dlang/dub#1474.

Testing the latest commit d5b3dd9 on Windows 10, dmd v2.091.0 32-bit with

dub test --compiler=\D\dmd2\windows\bin\dmd.exe

and monitoring with Process Explorer, the Peak Working Set is still quite high: >1,947,000 K. This is very close to the 2 GB address-space limit for a 32-bit user-mode process on Windows. I suspect that dmd 2.090.1 requires a bit more, making it cross that limit.

Testing the same with 64-bit dmd:

dub test --compiler=\D\dmd2\windows\bin64\dmd.exe

the Peak Working Set is higher still: >3,040,000 K.

Testing af79adb, just before merging #272, dmd 2.091.0 32-bit requires >1,314,000 K. Still a lot, but comfortably under the limit. dmd 2.090.1 32-bit indeed requires slightly more memory for the same commit: >1,458,000 K.

Keeping track of the longest match throughout the parse obviously requires more memory. The question I think is this: is an increase of 48% reasonable to expect? And if yes, is that generally affordable? Note that this is only of value for a failed parse, so that if the answer is no, it may still make sense to put this behind an option or something.

I think I'll tag a new beta release and then test it on my large grammar with large inputs, to see what the memory usage becomes.

Keeping track of the longest match throughout the parse obviously requires more memory. The question I think is this: is an increase of 48% reasonable to expect?

Sorry, I was confusing the compilation with the parsing. The parsing obviously requires more memory, but the failed allocation is during compilation.

Why is it that my numbers are so much higher than those measured by @Zardoz89?

Could it be an issue with DMD on Windows? I asked on the DLang forums, and it looks like the latest versions of DMD have some weird issues:
https://forum.dlang.org/thread/roejernakipyjpllwakr@forum.dlang.org

@veelo and @Zardoz89 Thanks for looking into this so much and spending the effort you do to get my changes in.

My changes in #269 don't allocate much more memory, they just retain more objects. Normally, during parsing, complete ParseTrees would have been rejected and collected by the GC, but now they are retained for better diagnostics.

There is the general solution of having pegged use less memory (e.g. by reducing the size of the ParseTree struct, or by replacing the ~ used for array concatenation with Appenders).
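
A minimal sketch of that last point, not pegged's actual code, just to illustrate the difference: `~=` on a slice consults the GC's capacity metadata on every append, whereas an Appender caches the capacity locally, which is usually faster and can reduce GC pressure during heavy appending.

    import std.array : appender;

    // Sketch only (generic, not pegged's ParseTree code).
    T[] viaConcat(T)(T[] parts)
    {
        T[] result;
        foreach (p; parts)
            result ~= p;      // goes through the GC append machinery each time
        return result;
    }

    T[] viaAppender(T)(T[] parts)
    {
        auto app = appender!(T[]);
        foreach (p; parts)
            app.put(p);       // amortized appends into a locally tracked buffer
        return app.data;
    }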

Another one might be to discard failed longest matches when they are no longer the longest. This should allow the GC to free more memory while pegged is parsing, at the cost of a little CPU time to revisit earlier nodes.
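
A hypothetical sketch of that idea (the names are invented; this is not how pegged is currently structured): keep only the single longest failure seen so far and let everything else become garbage immediately.

    import pegged.peg : ParseTree;

    struct LongestFailure
    {
        size_t end;       // furthest input position reached by a failure
        ParseTree tree;   // the failed subtree that reached it
    }

    void recordFailure(ref LongestFailure best, ParseTree failed)
    {
        if (failed.end > best.end)
            best = LongestFailure(failed.end, failed); // previous tree becomes collectable
        // otherwise `failed` is simply dropped and the GC can reclaim it
    }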

It keeps confusing me as well, but since it is dmd that runs out of memory during compilation of the unittests, before it runs them, I am not sure whether the additional memory needed for retaining the longest match is the problem. It could be, I guess, if it involves some compile-time parsing.

I am almost tempted to disable all unittests but one, and switch them on individually to see which one is causing the problem. But I cannot devote that kind of time to this at the moment...
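
(If someone does get to it: a hypothetical way to do that without deleting anything is to wrap the suspect unittest blocks in version conditions and enable them one at a time, e.g. with "versions" in dub.json or -version= on the command line. The version identifiers below are invented.)

    version (PeggedTestArithmetic) unittest
    {
        // ... one of the existing pegged unittests ...
    }

    version (PeggedTestLeftRecursion) unittest
    {
        // ... another one ...
    }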

https://forum.dlang.org/post/r67qlg$19ui$1@digitalmars.com

The out-of-memory error happens because the build uses almost 2 GB of memory (from observing execution in the process explorer). Unfortunately, dmd versions 2.089.x and 2.090.x were no longer built with the LARGE_ADDRESS_AWARE bit set, which causes the OS to limit the 32-bit process to 2 GB for legacy reasons, instead of the theoretically possible 4 GB.

This bit was reenabled automatically when switching to LDC and the MS linker as host compiler for dmd 2.091.

So this explains why it apparently only fails with 2.090.1 on AppVeyor.
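
(For what it's worth, one can check whether a given dmd.exe has that bit set, assuming the MSVC tools are installed, with something like:)

    REM look for "Application can handle large (>2GB) addresses" in the output
    dumpbin /headers C:\dmd2\windows\bin\dmd.exe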

So, now: why is dmd's RAM usage so big on Windows? On Linux this isn't happening.
I tweaked Travis to execute dub build against extended_pascal with time -v to see the RAM usage. It stays around 900 MiB... very far from the numbers that we see on Windows, which come from running only the unit tests.

https://travis-ci.com/github/PhilippeSigaud/Pegged/jobs/313698926

Maximum resident set size (kbytes): 907324

I am facing a much more severe issue, where compiling dgrammer.d takes ~8 minutes and uses 45 GB of RAM.

What version of pegged do you use?

It seems like we really need newCTFE. I wonder if @UplinkCoder could maintain a newCTFE feature branch of dmd until it, hopefully, gets merged some day.

Alternatively we could merge newCTFE and have it hidden behind a -preview=newCTFE flag.

Has anybody tried using newCTFE with Pegged and measured any reduction in time and memory during generation of a parser from a PEG grammar?

I managed to at least merge newCTFE with today's dmd master, which seems promising.

This one could also be template related.

I need to measure.

Great. Can you merge master into newCTFE and fix compilation errors while you are at it?