morganstanley / hobbes

A language and an embedded JIT compiler

Home Page: http://hobbes.readthedocs.io/


Memory management with newArray

adam-antonik opened this issue · comments

If I run something like

g x = do{a = newArray(1000L) :: [double]; return ()}
f x = do{g(x); return f(x)}

f(0) :: ()

in hi, then it will happily sit in a tight loop, consuming all memory, until it gets killed. A slightly less silly example would have g doing a map over an array, which would internally return a new array. How can I do this sort of computation without leaking memory?
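
A minimal sketch of that map variant (hypothetical code, reusing the map and lambda syntax that appears elsewhere in this thread) might look like:

g x = do{a = map(\i.i*2.0, [0..9999]); return ()}
f x = do{g(x); return f(x)}

f(0) :: ()

Each call to g allocates a fresh intermediate array for the result of map, so the loop grows memory in the same way the newArray version does.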

This is a good question.

If you're at the shell, you can run something like this:

> printMemoryPool()
                     name id allocated used wasted
------------------------- -- --------- ---- ------
                  scratch  0      32KB   0B     0B
global region @ 0x10b8e90  1      96KB  80B     0B

This shows you that there are basically two memory regions in use. The "global" region is where global variables go, e.g.:

> xs = map(show,[0..100000])
> printMemoryPool()
                     name id allocated      used    wasted
------------------------- -- --------- --------- ---------
                  scratch  0      32KB        0B        0B
global region @ 0x10b8e90  1  2.7382MB 2.67041MB 31.9443KB

Whereas the "scratch" region is used for intermediate calculations that can be discarded at the end of an "evaluation cycle" (e.g. evaluating an expression at the shell):

> length(map(show,[0..100000]))
100001
> printMemoryPool()
                     name id allocated      used    wasted
------------------------- -- --------- --------- ---------
                  scratch  0       1MB        0B        0B
global region @ 0x10b8e90  1  2.7382MB 2.67041MB 31.9443KB

So here the "allocated" scratch space went up, but its "used" count is reset to 0, and repeated evaluations of the same expression wouldn't add to the "allocated" count.
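For example (a sketch; exact sizes will vary by machine and build), evaluating the same expression a second time reuses the already-allocated scratch space:

> length(map(show,[0..100000]))
100001
> length(map(show,[0..100000]))
100001
> printMemoryPool()

After the second evaluation, the scratch row should still report roughly the same "allocated" figure as before, with "used" back at 0B.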

There are some secret functions that let you manipulate the region set: "unsafeClearMemoryPool" resets the active region, "unsafeMakeMemRegion" creates a new one, and "unsafeSetRegion" sets the active region. Obviously these are dangerous and error-prone. It's not ideal; it's basically unchecked, manual region-based memory management.

Where we've used hobbes, this approach has worked well, especially in latency-sensitive programs where we want to avoid unpredictable delays.

I think it would be interesting to consider region inference methods to check and automate the kind of memory management that we currently do. Maybe we could optionally include other methods of garbage collection, for applications that have different tradeoffs in mind.

Thanks, that makes sense. I can get what I wanted working in this manual style. It would be nice, though, to have a different memory-management scheme as a compile-time option. I've had a little play with plugging in the Boehm GC, which I seem to have got working in single-threaded mode; I'm still trying to work out how to get multiple threads to play nicely.

For some of these use-cases we have where we're sensitive to latency, an incremental garbage collector might be a good option.

It seems like there's some value in having the type system encode whether something allocates (a handy side-effect of region-annotated programs). I've had several conversations where "does this dynamically allocate memory?" comes up, and having the type system remove that ambiguity would be useful.

I'm sorry, but I need a bit more help on this.

I have a list [exists E.((E * ()) -> () * E)], and each of these closures uses a fair amount of memory. I'd like to evaluate them each in turn without being OOMed. Messing around with various ways of resetting memory pools invariably gives something along the lines of "FATAL ERROR: Unexpected variant match failure on 0x555c1cb00930 at stdin:14 ('either x d f = case x of |0:_=d,1:y=f(y)|')\n"
Do you have any hints as to how to structure this?

No apologies needed, memory management is a tricky problem (assuming you're not content to punt and use generic GC). It's a little like the contextless-function / closure issue; maybe there's a way we can do region inference up to the point where it can't be decided, and then we'd at least have a principled way to punt to generic GC conservatively.

Anyway, when you see that error about variant match failures, it means that you've read bad memory (e.g. you reset a memory region but then read something out of it later). There are other ways you can read bad memory that don't crash (but could be worse if you e.g. read a bad price).
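
As a sketch of that failure mode (hypothetical code, using the region functions mentioned earlier in this thread), this sequence saves the previous region, allocates xs in a scratch region, resets that region, and then reads xs after the reset:

scratchR = unsafeMakeMemRegion("scratch")
r = unsafeSetRegion(scratchR)
xs = map(show, [0..100])
unsafeClearMemoryPool()
unsafeSetRegion(r)
println(xs)

The final println(xs) dereferences memory that the cleared region may have reclaimed or reused, which is exactly the kind of bad read that can surface as a variant match failure (or, worse, silently return garbage).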

Assuming that your closures are allocated out of the default region, and that they aren't correlated by memory in any way (you're not allocating something within one closure that a subsequent closure might be able to use later), I would think that you can get away with just one extra scratch region that you swap in around each closure call (we actually use this trick pretty heavily internally).

That could look something like this:

$ cat q.hob
scratchR = unsafeMakeMemRegion("my scratch region")

applyEachInScratch cs = each(
  \c.do {
    r = unsafeSetRegion(scratchR);
    putStr("eval to: ");
    println(apply(c, 1));
    unsafeClearMemoryPool();
    unsafeSetRegion(r);
  },
  cs
)

makeClosureList a b c = [
  \x.just(x * a + 10),
  toClosure(\_.nothing),
  \x.just(x * x + b * x + 42),
  toClosure(\_.nothing),
  \x.just(sum([1..x]) / c),
  toClosure(\_.nothing)
]

$ hi -s q.hob
> applyEachInScratch(makeClosureList(10, 20, 30))
eval to: 20
eval to:
eval to: 63
eval to:
eval to: 0
eval to:

> 

I'm curious to know your thoughts.

I think that all of that awkward explicit shuffling around of memory regions might be something that could be automated. We could probably put memory regions in the type system (like we do for structured data files) and validate memory safety by type checking. That's a lot of hand waving without an actual implementation though.

Lifting regions to the type system seems like a very good idea. It may well help with debugging the problem I have: using applyEachInScratch still gives this FATAL ERROR. I've now got something roughly like

someData = loadData()
someKeys = loadKeys()
process key = func(someData, initSomeObject(), key)
closures = [\().process(k) | k <- someKeys]
applyEachInScratch(closures)

So the closures all operate on the same state, but they shouldn't be passing data from one to another. Something's clearly gone screwy with this, though... I'll look at cutting bits out to try to see what exactly breaks, now that I've got your scratch-clearing routine.

Ahh, func had an off-by-one bug deep inside it, and this was only getting tickled by the memory rearrangement forced by clearing the memory pool. Thanks for the help here; having some confidence in the pool clearing made it much simpler to track down the issue.

Was it off-by-one array indexing? That's another place where some creative type-system changes would be useful. Array indexing is unchecked right now (for performance), but there are some ideas for constructing array indexes that could make this safer without a performance penalty.

Sort of: I was constructing and then filling an array, but it was one element too large. But yes, nothing that wouldn't have been caught by a decent dependent type system.