no-defun-allowed / baker-gc

A loose interpretation of an incremental copying collector

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

2018-10-08 08:31 <no-defun-allowed> i’m going to create the world’s first conservative copying collector

2018 no-defun-allowed did in fact not create a conservative copying collector, nor would she have been even close to being first to do it. But 2020 no-defun-allowed has a conservative copying collector at least.

+This is a rough implementation of List Processing in Real Time on a Serial Computer , which can collect allocated conses incrementally, and somewhat conservatively (on the stack and registers). However, the client must use a read barrier (as implemented by the car and cdr functions provided.)+

Actually, no, it’s a replication copying collector. It copies everything reachable from the roots to newspace, then updates all the references in referenced conses. This only requires a write barrier to replicate modifications if there is a newspace copy of a cons.

This utterly abuses setjmp and how stacks appear to work on my machine, so by using this code, you agree to not sue me for limbs lost to it. One can trace roots (no pun intended) to SBCL’s semi-conservative copying collector and the Shenandoah JVM garbage collector, which are not likely to blow your limbs off. The stack walking code was also gleaned off Gilbert Baumann’s mark-sweep implementation.

How it works

./images/pages.png

Allocation

Conses are organised into pages (struct page of pages.h), with each page 16KiB wide. Each cons is a triple <forwarding pointer, car, cdr>. We allocate from the start of a page to its end, as typical for compacting collectors.

Scanning

At the start of a collection cycle, the stack and registers are scanned for roots.The registers are scanned by using setjmp to “spill” the registers into the stack, and the stack can be carefully probed for values that look like pointers into the heap.

Checking what looks like a pointer can be done better with some heuristics instead of iterating over pages most of the time; provided pages are allocated mostly contiguously, we can eliminate most pointers outside the heap by checking for presence in an interval holding all pages. Pages are also cached, so that most searching can be eliminated.

At the end of a collection cycle, the stack and registers are scanned for conses that must be pinned (as we cannot update the conservative roots), and have their slots fixed up, by copying them from the copy.

Barriers

Each object has a forwarding pointer, either to itself, or its copy in newspace. With replication copying, this pointer is only used when writing, to replicate modifications in oldspace to newspace. Checking pointer (physical) equality also requires retrieving the forwarding pointer to get canonical pointers to compare.

Scheduling

Garbage collections start after some number of pages, the threshold, have been allocated after the last collection finished. In order of priority, the threshold is chosen so that:

  • it is never below 10 pages (about 80kB),
  • stack scanning completes in less than 3 milliseconds (adjustable with PAUSE_TARGET), and
  • the heap remains between 0.5x and 1.3x its last size after a collection.

About

A loose interpretation of an incremental copying collector


Languages

Language:C 98.9%Language:Makefile 1.1%