kriszyp / weak-lru-cache

A cache using LRU and weak references to cache data in a way that works in harmony with garbage collection

GC behaviour in sync execution

pieh opened this issue · comments

Hi Kris, first of all huge thanks for this lib (and lmdb-store/lmdb-js).

This is not really a bug report or feature request, more of a question(s)

I've been debugging some unexpected (to me) interaction of WeakRef and sync execution. Primarily: as soon as I use the cache setting with lmdb-store (even with expirer: false), it seems like newly created WeakRefs (or already-existing ones that get dereferenced) prevent target objects from being GCed within the same tick (this doesn't happen when the cache is disabled, but I would like to avoid going down that route if at all possible).

During my checks I found that, unfortunately, this behaviour of WeakRef is actually in the spec ( https://tc39.es/ecma262/#sec-weak-ref-objects ); in particular, AddToKeptObjects / ClearKeptObjects are of importance here. I had quite a difficult time finding the rationale for this, and the only thing I found was this comment: https://bugs.chromium.org/p/v8/issues/detail?id=12145#c8

For more completeness, the whole point of the kept objects stuff is to disallow the antipattern of polling deref() in a loop to synchronously observe GC timing. If we convince ourselves that can't happen with whatever optimization we want to do it should be fine.

Which is quite frustrating (if that's the only reason for the behaviour), as I (or this lib) really have no intention of abusing it in such a way.
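To be clear, my understanding is that the antipattern that comment is guarding against looks roughly like the sketch below (which neither I nor this lib does) - just an illustration, not anything from real code:

// the pattern the kept-objects semantics is meant to rule out:
// trying to synchronously observe GC timing by polling deref() in a loop
let ref = new WeakRef({ some: "object" });
while (ref.deref() !== undefined) {
  // this loop can never terminate - every deref() re-adds the target to the
  // kept-objects list, so it cannot be collected until the current job ends
}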

Just to illustrate how the original problem can happen in normal code (at the debugger breakpoints I was taking heap snapshots, which does trigger GC):

const { WeakLRUCache } = require("weak-lru-cache");

// expirer: false to just use WeakRef parts, otherwise
// LRU part would strongly retain objects in memory
let myCache = new WeakLRUCache({ expirer: false });

// just a debug helper to find those by `Constructor` in heap snapshots
class FindMeInHeapSnapshot {
  constructor(value) {
    this.value = value;
  }
}

async function test() {
  for (let i = 1; i <= 5; i++) {
    const somethingToWeaklyCache = new FindMeInHeapSnapshot(i);
    // after leaving scope there should be no strong reference to
    // `somethingToWeaklyCache` and triggering GC ideally can clean it up
    myCache.setValue(i, somethingToWeaklyCache);
  }

  debugger; // all `FindMeInHeapSnapshot` instances still in heap snapshot :(

  await new Promise((resolve) => setImmediate(resolve));

  debugger; // finally it was GCed
}

test();

So, finally my question(s):

  1. Do you think there is any trick to work around this? (I really doubt it :( ) I would really like to keep things sync for speed at scale (our usage is as a build tool, so our process doesn't really need to be "responsive" to incoming requests etc.).
  2. Any advice on how to best address it? Changing to async iterations seems like the obvious one, but nodejs/node#31979 seems to suggest there is a severe price (in terms of speed) to be paid by doing that. A solution in between could be to allow microtasks to run every X iterations, so it's batch-like. Anything else I could consider?
  3. Do you think this "problem" deserves some documentation (just so the next person hitting it doesn't go on a wild goose chase like I did :) )?

I was actually unaware of this behavior; it is quite irritating that WeakRef behaves this way, as the documentation seems to clearly indicate that users should be aware of the non-determinism of GC.

  1. I can't think of any direct workaround for GC'ing these objects within an event turn. It is possible that this library could optionally have some type of limit on the number of entries, and manually remove the entries (I presume that once a WeakRef is no longer reachable, it could be collected synchronously). However, this certainly seems suboptimal since part of the goal of the weak cache is to ensure that only one instance of an object for a given key ever exists in memory at once, and that would no longer be guaranteed (but maybe if it is only being used for performance that would be ok).
  2. I do think it would be worthwhile to add some logic to occasionally (after a large number of iterations, to avoid performance issues) wait for the next event turn. However, from my tests, it looked like this required a macro tick/event turn (setImmediate), and not a micro-tick (using await new Promise(resolve => queueMicrotask(resolve)) does not seem to be sufficient for clearing the weak references for me). This is also unfortunate because macro turns have a lot more overhead (are slower) than micro turns.
    It is also worth noting that lmdb-js tends to be conducive to async/multi-turn execution. You certainly can execute things synchronously in lmdb-js, but writes (puts/deletes) are most efficient when many can be batched and executed together in a large transaction in lmdb-js's write thread, which sends the completion signal in a future event turn.
  3. Yes, that's a good suggestion.

I was actually unaware of this behavior; it is quite irritating that WeakRef behaves this way, as the documentation seems to clearly indicate that users should be aware of the non-determinism of GC.

Yeah, the thing is - I don't want my code to rely on any specific GC behaviour (as in using finalizers as something similar to destructors etc.), but I do kind of need to make sure a potentially huge list of objects is at least possible to GC (on memory pressure etc.), as the thing I'm working on relates strictly to the OOM situations we're hitting.

(I presume that once a WeakRef is no longer reachable, it could be collected synchronously)

Nope - you can swap cache.setValue in my example snippet for just new WeakRef, without referencing the just-created WeakRef anywhere. And this is in line with the spec itself - just creating a new WeakRef or calling .deref() results in the same thing: the target can't be GCed during that tick.
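i.e. even this variation of my earlier snippet behaves the same way for me (sketch of what I mean):

async function test() {
  for (let i = 1; i <= 5; i++) {
    const somethingToWeaklyCache = new FindMeInHeapSnapshot(i);
    // the WeakRef isn't stored anywhere - merely creating it is enough
    // to keep `somethingToWeaklyCache` alive for the rest of this tick
    new WeakRef(somethingToWeaklyCache);
  }

  debugger; // all instances are still in the heap snapshot
}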

I do think it would be worthwhile to add some logic to occasionally (after a large number of iterations, to avoid performance issues) wait for the next event turn. However, from my tests, it looked like this required a macro tick/event turn (setImmediate), and not a micro-tick (using await new Promise(resolve => queueMicrotask(resolve)) does not seem to be sufficient for clearing the weak references for me). This is also unfortunate because macro turns have a lot more overhead (are slower) than micro turns.

Yup, that's the current plan right now (hello if (i%X===0) await new Promise(r => setImmediate(r))). It's somewhat difficult to determine a good value of X, just because documents can vary considerably in size (there's no strict schema), but that's something for me to figure out.
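In full form that plan looks something like the sketch below (just a sketch - iterateDocuments / processDocument are placeholders for our actual code, and X still needs tuning):

const X = 1000; // placeholder - the right value depends on document sizes

async function indexAll() {
  let i = 0;
  for (const doc of iterateDocuments()) { // placeholder for our real iterator
    processDocument(doc); // placeholder for the actual indexing work
    // yield a macrotask every X iterations so kept objects get cleared
    // and the weakly cached targets become collectable again
    if (++i % X === 0) {
      await new Promise((resolve) => setImmediate(resolve));
    }
  }
}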

It is also worth noting that lmdb-js tends to be conducive to async/multi-turn execution. You certainly can execute things synchronously in lmdb-js, but writes (puts/deletes) are most efficient when many can be batched and executed together in a large transaction in lmdb-js's write thread, which sends the completion signal in a future event turn.

For me it's mostly about reads right now (cases like iterating on documents to create reverse lookup tables / indexes after the documents were already written; the fields to index on are not hardcoded but computed/discovered later, so I can't generate the indexes at the same time as I do my writes).

One thing I thought of, which is also less than ideal, is to make my reads use a "cached" WeakRef only if it's already there, but not create a new one in the specific code paths where I iterate on potentially huge collections. But this would require Database.get (from my standpoint as a consumer) to allow specifying that for a particular call (rather than globally for the entire db), OR me disabling the cache in lmdb and using weak-lru-cache on my own (but then I'd need to replicate a lot of things, heh, so not ideal).

As I mentioned, in general I do want to use the cache as it's very useful, and in most cases my reads are intertwined with other (most often async) processing. It's really just a few code paths (like indexing) where we currently do sync iterations on huge collections. Also, not using it might actually cause a breaking change for our users, as Database.get(id) === Database.get(id) would no longer be true - behaviour that wasn't really documented or "guaranteed", but has been true for so long that I bet at least some users rely on it.

In any case - thanks for the very quick reply. I will think on it some more.

Do you think there is any trick to work around this

Actually, I have some potentially good news: V8 exposes a ClearKeptObjects() method, which does basically what you would expect: it synchronously clears the kept objects. And I could expose this in JS. There is also a --harmony-weak-refs-with-cleanup-some flag that allows for forced/synchronous execution of the finalization registry callbacks. As a proof of concept, this (completely synchronous) function will successfully clear the WeakRef, return undefined from the deref() call, and execute the finalization registry callback, if run with node --harmony-weak-refs-with-cleanup-some --expose-gc and with clearKeptObjects defined:

(function() {
	let finalized
	let fr = new FinalizationRegistry(() => {
		finalized = true;
	});
	function makeRef() {
		let o = {hi:'hello'}
		fr.register(o, 'test')
		return new WeakRef(o);
	}
	let w = makeRef();
	clearKeptObjects();
	gc();
	console.log(!!w.deref());
	fr.cleanupSome();
	console.log({finalized})
})()

Ultimately, I think this would allow you to periodically clear the kept objects and trigger the finalization registry callbacks without ever needing to wait for event turns.

I'm inclined, at least initially, to just add clearKeptObjects to lmdb-js, so I don't have to create a native add-on just for this one function in weak-lru-cache. There are a couple of ways this could be implemented:

  • By default call clearKeptObjects & FinalizationRegistry.cleanupSome after a certain number of cache entry additions (maybe 8K or so), with options to change frequency (or disable).
  • Just expose this as a function, and let users decide if and when they want to call clearKeptObjects and/or FinalizationRegistry.cleanupSome.
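For the first option, I am imagining something roughly like the sketch below (untested; the names and numbers are placeholders, and cleanupSome is only available behind the --harmony-weak-refs-with-cleanup-some flag):

const CLEAR_INTERVAL = 8192; // would be configurable (and could be disabled)
let additionsSinceClear = 0;

// stand-in for the FinalizationRegistry the cache already uses
const registry = new FinalizationRegistry((key) => { /* remove stale entry */ });

function afterCacheEntryAdded() {
  if (++additionsSinceClear >= CLEAR_INTERVAL) {
    additionsSinceClear = 0;
    clearKeptObjects(); // the native binding discussed above
    if (typeof registry.cleanupSome === 'function')
      registry.cleanupSome(); // run queued finalizers, if the flag enables it
  }
}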

Wow, you did dive deep there - I really appreciate that!

I added some comments to just show how I understood your above snippet - is that correct? (can't really run this myself)

(function() {
	let finalized
	let fr = new FinalizationRegistry(() => {
		finalized = true;
	});
	function makeRef() {
		let o = {hi:'hello'}
		fr.register(o, 'test')
		return new WeakRef(o);
	}
	let w = makeRef();
	clearKeptObjects();
	gc(); // after GC our `{hi:'hello'}` is no longer in the heap, but finalizer wasn't/might not have been called (?)
	console.log(!!w.deref());
	fr.cleanupSome(); // this triggers finalizer (in `weak-lru-cache` cleanup `Map<string,WeakRef>` map from entries that no longer are in the heap)
	console.log({finalized})
})()

I'm inclined, at least initially, to just add clearKeptObjects to lmdb-js, so I don't have to create a native add-on just for this one function in weak-lru-cache. There are a couple of ways this could be implemented:

  • By default call clearKeptObjects & FinalizationRegistry.cleanupSome after a certain number of cache entry additions (maybe 8K or so), with options to change frequency (or disable).
  • Just expose this as a function, and let users decide if and when they want to call clearKeptObjects and/or FinalizationRegistry.cleanupSome.

This does sound perfect to me (too good to be true?). Selfishly, I would appreciate just exposing it (lmdb is more than fine for me, as that's the "top-level" lib I work with), as I am already aware of this WeakRef behavior and being able to make use of clearKeptObjects would be more than enough.

Some default calls (I guess with default settings like you proposed) do however make sense for consumers in general (abstracting "low level" details like that is one of the reasons we use libs at all :) ), as long as the consumer has a chance to adjust the setting (or disable it outright and handle it themselves, if clearKeptObjects is exposed).

I do wonder, however, if doing something like that (exposing clearKeptObjects() to JS land) won't be deemed an "antipattern" and Node decides to make it impossible in the future (just going off that one comment about the rationale I found). But at least for current versions there would be a way to do this without the need to migrate to setImmediate every now and then (that would at least limit the scope of the refactors directly ahead of me).

Oh, and also - from my understanding clearKeptObjects() should be cheap, as under the hood it should just clear an internal Set/Array and nothing more. It doesn't trigger GC on its own, but rather allows a potential/conditional GC (triggered on allocations by Node.js) to collect objects that at that point are not strongly referenced by either application code or Node.js internals. Is that correct?

I do wonder, however, if doing something like that (exposing clearKeptObjects() to JS land) won't be deemed an "antipattern" and Node decides to make it impossible in the future

API changes are possible. However, this is actually part of the V8 api, not Node's directly, and Node's traditional add-on API is based on giving direct access to V8 objects and V8 itself. Using the NAN layer is encouraged, but enforcing that would be a major change that would break a lot of stuff.

It is certainly possible that there are other ramifications that I am not aware of. The docs say you aren't supposed to do this, but I presume that's just to conform to the ES spec, which we are intentionally defying (because we disagree with it):

https://v8docs.nodesource.com/node-16.13/d5/dda/classv8_1_1_isolate.html#ac4fa777212eeb0550aae4f0a7d36e32d

should be cheap

Yeah, that's my expectation as well, but I will run some quick benchmarks when I get a chance.

gc(); // after GC our {hi:'hello'} is no longer in the heap, but finalizer wasn't/might not have been called (?)
console.log(!!w.deref());
fr.cleanupSome(); // this triggers finalizer (in weak-lru-cache cleanup Map<string,WeakRef> map from entries that no longer are in the heap)

Yes, that's correct, finalizer callbacks are queued and not immediately executed during GC (which I believe can even happen from a separate thread). There is actually a TC39/ES proposal for cleanupSome, which is what V8 implements with the --harmony-weak-refs-with-cleanup-some flag:

https://github.com/tc39/proposal-cleanup-some

The proposal kind of makes it sound like this function should also clear kept objects, which would be the one-stop solution for our issue, but I didn't observe that in my testing: I still had to manually clear kept objects, and then this function just enabled running the finalizers. Of course, it is possible that this API will change as well. Running the finalizer callbacks isn't as essential as clearing the kept objects, because presumably the kept objects are the biggest memory holder, and once the kept objects are cleared and GC'ed it is feasible to occasionally poll weak references to find cleared references and remove the corresponding Map entries.
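That occasional polling could be as simple as something like this (illustrative sketch only, not what the lib currently does - it assumes a Map of key -> WeakRef like the one weak-lru-cache keeps):

function sweepClearedRefs(map) {
  for (const [key, weakRef] of map) {
    if (weakRef.deref() === undefined)
      map.delete(key); // the target was collected, drop the stale entry
  }
  // note: deref() on still-live refs re-adds them to the kept objects,
  // so in a long synchronous run this should be followed by clearKeptObjects()
}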

should be cheap

Some quick tests show that not only is it cheap (the baseline time for calling clearKeptObjects is about 24ns per call), it actually significantly improves the performance of creating/tracking WeakRefs. On my computer the following code runs 5x faster (!!!) than if I comment out the clearKeptObjects call (and I presume this is because expanding/allocating increasingly large internal arrays to hold the "kept" objects has a lot of overhead that is avoided when we frequently clear them):

(function() {
  let start = performance.now();
  let w = []
  for (let i = 0; i < 1000000; i++) {
    // keep a rolling window of 100 live WeakRefs (older ones get overwritten)
    w[i % 100] = new WeakRef({hi: 'hello'})
    // clear the kept-objects list every 100 allocations
    if (i % 100 == 0)
      clearKeptObjects();
  }
  return performance.now() - start
})()

This takes about 110ms to execute (110ns per cycle), whereas without clearKeptObjects it takes about 500-600ms. Anyway, I will try to get this into lmdb-js and published soon for you to try out.

I must say I enjoyed reading this thread! Thanks to both of you for exploring this stuff 👍 I was pretty sure that the only option here was to stick with periodic setImmediate ticks in the loop (and thus making the sync code paths async). But this is great.

Published lmdb-js v2.1.7, which now exports a clearKeptObjects() function. I haven't done anything else with any automated/periodic clearing, but this should give you access to the main function for manually clearing the kept objects periodically on your own. Good luck, and let me know if it works for you :).
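Usage would be roughly like the sketch below (adjust the import to however you install/consume the package, and pick an interval that suits your document sizes - the names and numbers here are just placeholders):

const { open, clearKeptObjects } = require('lmdb'); // assuming the lmdb package name

const db = open({ path: './data', cache: true });

function buildIndexes(keys) {
  let i = 0;
  for (const key of keys) {
    const doc = db.get(key); // may create/deref a WeakRef via the cache
    // ... synchronous indexing work on `doc` ...
    if (++i % 10000 === 0)
      clearKeptObjects(); // let no-longer-referenced targets become collectable
  }
}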

I would have replied earlier, but I had to migrate from lmdb-store before actually trying it, so it took a while - but it worked as advertised :)

I managed to keep my completely sync iterations as they were before, and it worked in an env with purposely limited memory. I put clearKeptObjects() calls directly in our existing iterator internals, so consumers don't even need to care about it (we do have a few) - gatsbyjs/gatsby@755d946 (it has some unrelated changes; the main point is in the iterable module. I will be checking for the optimal way to do it, because my generator there adds an extra layer, but this is good enough for now to get past the stages that were just OOMing before).

---edit
With my tests I use documents of ~2MB size to stress test it - hence I disable the expirer part completely, at least for now, as with the default settings it was still causing OOMs in my testing scenarios; more likely I will actually play with its config rather than disable it completely.

but it worked as advertised :)

Great to hear.

With my tests I use documents of ~2MB size to stress test it - hence I disable the expirer part completely, at least for now, as with the default settings it was still causing OOMs in my testing scenarios

Actually, I think this is a bug/regression in lmdb-js. lmdb-js is supposed to compute an expiration priority based on entry size, which should cause larger entries to expire from the LFRU expirer much more quickly, but it looks like the value isn't getting passed through. This should be an easy fix. That being said, it's probably still worthwhile to have a somewhat smaller LRU table/cache size with large entries/docs.