ezyang / compact

Compact regions library for Haskell

Could `compact` be made to happen transparently at the next GC?

jberryman opened this issue

This is maybe OT since it's more of a question, and obviously entails changes to GHC, but I thought I would just ask here.

The way I understand compact regions to work is:

  • `compact x` copies `x` to a compact region immediately (like a manual GC)
  • references to the original `x` will keep the un-copied data alive in the normal heap
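
To make the two points above concrete, here is a minimal sketch using the existing `GHC.Compact` API from the `ghc-compact` package. The names `xs`, `ys` and `region` are only illustrative, and the sketch assumes a GHC recent enough to ship `ghc-compact` (8.2 or later). It copies a list into a region and then asks, via the library's membership test, which of the two references actually points into the region.

```haskell
import GHC.Compact (compact, compactSize, getCompact, inCompact)

main :: IO ()
main = do
  -- An ordinary lazy value; once evaluated it lives in the normal heap.
  let xs = map (* 2) [1 .. 1000 :: Int]

  -- compact evaluates xs fully and copies it into a fresh region right away.
  region <- compact xs

  -- getCompact returns the root of the copy that lives inside the region.
  let ys = getCompact region

  -- inCompact (as exported by GHC.Compact) tests whether a value's closure
  -- lives inside the given region.
  copyInside <- inCompact region ys
  origInside <- inCompact region xs
  size <- compactSize region

  putStrLn ("copy in region:      " ++ show copyInside)
  putStrLn ("original in region:  " ++ show origInside)
  putStrLn ("region size (bytes): " ++ show size)

  -- Both names denote the same value; holding on to xs is what keeps the
  -- un-copied data reachable.
  print (sum xs == sum ys)
```

In practice this suggests dropping every reference to `xs` after compaction and working only with `getCompact region`.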

What I'm wondering is whether the copying in the first point could be deferred until the next GC (at which time the data would be copied into the compact region rather than into, say, the older generation). I would think this would make `compact` essentially free: it could become some kind of pure annotation (like `seq`), and the second point would be resolved as well.

It's absolutely possible. But there are some technical difficulties:

  1. You still have to `deepseq` the structure, because you're not going to be able to do evaluation while GC is running (the mutators are locked out). This is pretty disappointing, because a fast implementation of `compact` could operate by simultaneously evaluating and shunting the allocations directly into the compact region (this is sort of hard to do in practice, so it is not implemented).
  2. Suppose you "defer" such a copy to GC time. The easiest way to represent this on the heap is as a tagged pointer to the object to copy. When you "hit" it during GC, you switch into depth-first traversal and move things over. But what if some subgraph has already been evacuated because another GC thread got to it first? Well, now you're going to have to copy it again.
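
Point 1 rests on the fact that compaction needs fully evaluated data. Today's `compact` does that forcing itself while it copies, and the sketch below tries to make this visible by compacting a list that hides an erroring thunk. `tryCompact` is a hypothetical helper written for this illustration, not part of the library, and the sketch assumes that an exception raised while forcing a thunk propagates out of `compact` like any other exception.

```haskell
import Control.Exception (SomeException, try)
import GHC.Compact (Compact, compact, getCompact)

-- Hypothetical helper: run compact and capture any exception raised
-- while the structure is being evaluated and copied.
tryCompact :: a -> IO (Either SomeException (Compact a))
tryCompact = try . compact

main :: IO ()
main = do
  -- The spine is fine, but the second element is an unevaluated error thunk.
  let bad = [1, error "forced during compaction", 3] :: [Int]
  r1 <- tryCompact bad
  putStrLn (either (const "bad:  a hidden thunk was forced")
                   (const "bad:  nothing was forced")
                   r1)

  -- A structure without bottoms compacts normally.
  let good = [1, 2, 3] :: [Int]
  r2 <- tryCompact good
  case r2 of
    Right c -> putStrLn ("good: compacted, sum = " ++ show (sum (getCompact c)))
    Left e  -> putStrLn ("good: unexpected failure: " ++ show e)
```

A GC-time variant would have to pay for this evaluation before the collector runs, since the collector cannot enter thunks itself.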

Anyway, it's a good idea, and someone should give it a try.

Another great primitive that I've pondered in the past is `inCompact :: a -> a`. This would cause the mutator to allocate the given value directly into the compact region. Unfortunately, this is very hard to do, since in the course of evaluation you may run into thunks that should not belong in the region. I've thought a bit about how to make this work, but the solutions I've been able to think of are too complex to pull their weight.
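
For comparison, the closest thing available in `GHC.Compact` today is probably `compactAdd`, which copies an already-built value into an existing region rather than allocating into it directly. The sketch below is only meant to show that contrast; it is not an implementation of the primitive described above. Note also that the library's existing `inCompact` is a membership test with a different type from the hypothetical `inCompact :: a -> a`.

```haskell
import GHC.Compact (compact, compactAdd, getCompact, inCompact)

main :: IO ()
main = do
  -- Start a region with one value in it.
  region <- compact "first value"

  -- compactAdd evaluates and copies a second value into the same region
  -- and returns a handle whose root is that new copy.
  region' <- compactAdd region (replicate 3 'x')

  -- The library's inCompact checks membership in the original region.
  sameRegion <- inCompact region (getCompact region')

  putStrLn (getCompact region')
  putStrLn ("added value lives in the original region: " ++ show sameRegion)
```

The value passed to `compactAdd` is still built in the normal heap first and then copied, which is exactly the extra step a direct-allocation primitive would avoid.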