ballerina-platform / ballerina-lang

The Ballerina Programming Language

Home Page: https://ballerina.io/

[Bug]: OOM when running stdlib level 8 and 6

heshanpadmasiri opened this issue · comments

Description

PRs to the nutcracker branch fail when running stdlib levels 8 and 6 due to OOM. Looking at the stack trace, the most likely culprit is insertAtomAtIndex, which is triggered when loading BIR. We need to investigate (if that is the actual source of the error) whether this is caused by one of the following:

  1. There is something wrong with how we serialize/de-serialize BIR, causing us to fill the array until we run out of memory
  2. Some change to those standard libraries (adding more types) is actually creating "too many" types
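
To make the first hypothesis concrete, here is a minimal sketch of how an index-addressed atom table can grow far beyond the number of atoms it holds. `AtomTableSketch` and its methods are hypothetical names for illustration. This is not the compiler's actual insertAtomAtIndex, only the growth pattern it may be exhibiting.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a table addressed by rec atom index.
// NOT the real insertAtomAtIndex; it only illustrates the growth pattern.
final class AtomTableSketch {
    private final List<Object> atoms = new ArrayList<>();

    // Store an atom at a fixed index, padding any gap with nulls ("holes").
    void insertAtIndex(int index, Object atom) {
        while (atoms.size() <= index) {
            atoms.add(null);
        }
        atoms.set(index, atom);
    }

    // Number of slots allocated, including holes.
    int slots() {
        return atoms.size();
    }

    // Number of slots that hold no atom.
    long holes() {
        return atoms.stream().filter(a -> a == null).count();
    }
}
```

With this layout, memory use tracks the largest index seen rather than the number of atoms stored, so a single large or corrupted index is enough to allocate a very large backing array.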

Steps to Reproduce

No response

Affected Version(s)

No response

OS, DB, other environment details and versions

No response

Related area

-> Compilation

Related issue(s) (optional)

No response

Suggested label(s) (optional)

No response

Suggested assignee(s) (optional)

No response

So far I have observed the following memory-related problems in the BIR serialization logic:

  1. Not every rec atom we create is actually defined in the BIR. We create various atoms as part of various type checks (such as the Cloneable type). When we serialize the package we don't serialize these atoms, but we still leave "holes" in the atom indices (since they are determined by the number of rec atoms defined at that point). As a workaround, I introduced the ability to "compact" the rec atom indices before serialization by removing nulls. Unfortunately, this reduces memory consumption only for packages separated by one level. (One possible improvement would be to analyze the package before serialization and compact the indices at that point.) I managed to fix the out-of-memory issues in levels 6 and 7 this way. This was done by "re-indexing" the rec atoms as we write the BIR. Basically, for each rec atom index we calculate a new index from the order in which we first see it when serializing the BIR. This should ensure there are no holes in the RecAtoms in the BIR (assuming we serialize only one package).
  2. Let's call BDDs with a single atom, left true, and middle and right false "simple BDDs". Most BDDs are of this simple variety (they represent a single atomic type). However, for each of these we spend memory on 3 fields that hold pointers to left, right and middle (the actual values are singletons in this case). Instead, I introduced a different subclass that has only the atom and wastes no memory on fields for left, middle and right.
  3. All mapping and list atoms store their "members" as cells (in order to properly represent mutability), which consumes a considerable amount of memory.
  4. Different maps have fields with the same name, type and mutability combinations in most of the out-of-memory examples. Currently we create separate field objects for each (fixing 3 should reduce the overhead of this, however).
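
The re-indexing described in item 1 can be sketched roughly as follows. `RecAtomReindexer` is a hypothetical helper, not a class from the compiler. It assumes the serializer can ask for a new index each time it is about to write a rec atom reference.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: maps sparse in-memory rec atom indices to dense ones,
// assigned in the order the serializer first encounters each atom.
final class RecAtomReindexer {
    private final Map<Integer, Integer> oldToNew = new LinkedHashMap<>();

    // Return the dense index for oldIndex, allocating the next free one on first sight.
    int reindex(int oldIndex) {
        Integer existing = oldToNew.get(oldIndex);
        if (existing != null) {
            return existing;
        }
        int next = oldToNew.size();
        oldToNew.put(oldIndex, next);
        return next;
    }

    // Number of distinct rec atoms written so far.
    int size() {
        return oldToNew.size();
    }
}
```

Because new indices are handed out consecutively from 0, the written indices have no gaps regardless of how sparse the in-memory indices were. As noted above, this only holds while a single package is being serialized with one such mapping.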
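
For item 2, a minimal sketch of the "simple BDD" idea might look like the following. All type names here (`Bdd`, `BddLeaf`, `BddNode`, `FullBddNode`, `SimpleBddNode`, `BddNodes`) are illustrative assumptions and do not mirror the compiler's real class hierarchy.

```java
// Hypothetical, simplified BDD model; names are illustrative only.
interface Bdd {}

// The two terminal values, shared as singletons.
enum BddLeaf implements Bdd { TRUE, FALSE }

interface BddNode extends Bdd {
    Object atom();
    Bdd left();
    Bdd middle();
    Bdd right();
}

// General node: pays for three child references per instance.
final class FullBddNode implements BddNode {
    private final Object atom;
    private final Bdd left;
    private final Bdd middle;
    private final Bdd right;

    FullBddNode(Object atom, Bdd left, Bdd middle, Bdd right) {
        this.atom = atom;
        this.left = left;
        this.middle = middle;
        this.right = right;
    }

    public Object atom() { return atom; }
    public Bdd left() { return left; }
    public Bdd middle() { return middle; }
    public Bdd right() { return right; }
}

// Simple node: stores only the atom; the children are implied constants.
final class SimpleBddNode implements BddNode {
    private final Object atom;

    SimpleBddNode(Object atom) {
        this.atom = atom;
    }

    public Object atom() { return atom; }
    public Bdd left() { return BddLeaf.TRUE; }
    public Bdd middle() { return BddLeaf.FALSE; }
    public Bdd right() { return BddLeaf.FALSE; }
}

final class BddNodes {
    private BddNodes() {}

    // Pick the compact representation whenever the children match the simple shape.
    static BddNode create(Object atom, Bdd left, Bdd middle, Bdd right) {
        if (left == BddLeaf.TRUE && middle == BddLeaf.FALSE && right == BddLeaf.FALSE) {
            return new SimpleBddNode(atom);
        }
        return new FullBddNode(atom, left, middle, right);
    }
}
```

Routing construction through a single factory keeps callers unaware of which representation they get, so the saving applies wherever the simple shape occurs without touching the code that reads nodes.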