apollographql / apollo-client

:rocket: A fully-featured, production-ready caching GraphQL client for every UI framework and GraphQL server.

Home Page: https://apollographql.com/client

Repository: https://github.com/apollographql/apollo-client

Calling `cache.gc()` might cause memory usage increase and performance impact since v3.9

tgohn opened this issue

Issue Description

Since @apollo/client v3.9, calling client.cache.gc() subsequently increases memory usage in various caches.
Some affected caches are:

  • inMemoryCache.executeSelectionSet
  • inMemoryCache.executeSubSelectedArray
  • client.cache.addTypenameTransform.resultCache

This behaviour did NOT occur in v3.8.

Also, I think there is a performance impact when re-reading a previously used gql query document after calling cache.gc(); a debugging sketch follows below.
This happens because subsequent usage of cache.storeReader.executeSelectionSet (optimism.wrap) results in a cache miss.
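
A hedged way to observe this (storeReader.executeSelectionSet is a private internal of the cache, so this is a debugging aid only; SAME_QUERY stands in for any query the app has already run):

    // Debugging sketch, not a public API: peek at the memoized result-cache size.
    const reader = (client.cache as any).storeReader;

    console.log(reader.executeSelectionSet.size); // e.g. 6 before gc()
    client.cache.gc();
    // ...re-render a component that calls useQuery(SAME_QUERY)...
    console.log(reader.executeSelectionSet.size); // keeps growing on v3.9+, stayed flat on v3.8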

Some other notes:

When doing a git bisect on the apollo-client repo, the first commit introducing this behaviour is bd26676.

From reading the code, I think resetting the cache in inMemoryCache.addTypenameTransform (source) subsequently produces a different cache key for the same gql query.

The chain is something like this:

  • cache.gc() calls addTypenameTransform.resetCache()
  • the memoized transformed document for a given query is dropped
  • the next useQuery() run transforms the same query into a new DocumentNode object
  • executeSelectionSet (optimism.wrap) keys on that document, so the old entry is missed and a fresh one is added

Hope this makes sense.

Link to Reproduction

https://codesandbox.io/p/devbox/apollo-3-9-memory-leak-public-xknp7y?file=%2Fsrc%2FApp.tsx&workspaceId=ws_UB3koDt7PAYGsj7fRJouXc

Reproduction Steps

  • Click on "Toggle Show Location & cache.GC()" multiple times
    • this button mounts/unmounts a simple React component that uses the useQuery() hook (code copied from the get-started doc); a minimal sketch is shown after this list
    • this button also triggers apolloClient.cache.gc() at the same time
  • Observe that the various cache sizes keep increasing, for example inMemoryCache.executeSelectionSet.size
    • this behaviour started in v3.9.0
    • in v3.8, inMemoryCache.executeSelectionSet.size stays at 6
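
A minimal sketch of such a toggle (not the exact sandbox code; GET_LOCATIONS and the component names are placeholders loosely based on the get-started doc):

    import { gql, useQuery, useApolloClient } from "@apollo/client";
    import { useState } from "react";

    // Assumed query, loosely copied from the get-started doc.
    const GET_LOCATIONS = gql`
      query GetLocations {
        locations {
          id
          name
        }
      }
    `;

    function Locations() {
      const { loading, data } = useQuery(GET_LOCATIONS);
      if (loading) return <p>Loading…</p>;
      return <pre>{JSON.stringify(data, null, 2)}</pre>;
    }

    export function App() {
      const client = useApolloClient();
      const [show, setShow] = useState(false);
      return (
        <>
          <button
            onClick={() => {
              setShow((s) => !s);
              client.cache.gc(); // gc() on every toggle is what triggers the growth on v3.9+
            }}
          >
            Toggle Show Location & cache.GC()
          </button>
          {show && <Locations />}
        </>
      );
    }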

Reproduction screenshot:

[image]

@apollo/client version

3.12.11

Hmm.
It's expected that this will end up with a new identifier - but it's not expected that any references to the old one will stay in use for long, unless they're actively used - and in your reproduction, they are not actively used, so I have to admit that I'm a bit at a loss.

I'll have to dig deeper into this, but also appreciate any further insights you might come up with.

I would need to dig deeper into the code as well, to understand where the inMemoryCache.executeSelectionSet (optimism.wrap) cache was pruned.

But this could be related to a timing issue where:

  • cache.gc() was called
  • the useQuery hook's subscription.current.unsubscribe() was deferred by setTimeout (source)
  • hence when subscription.current.unsubscribe() was eventually called, it resulted in a cache miss when trying to delete the old entry in optimism's strong cache (screenshot below)

Screenshot:

[image]

Okay, I've dug into this - this is actually not an unbounded memory leak, but expected behaviour.

First thing, if this is disturbing you, you can reset everything by calling

client.cache.gc({ resetResultCache: true });

which will also reset/recreate the executeSelectionSet cache.

But that said, this is kinda expected behaviour given what the executeSelectionSet cache is: it's a rotating cache with a maximum size.
Having this cache "grow" is something we don't consider a "memory leak" unless it grows beyond its maximum size.

You can think of this cache like cache management in an operating system such as Linux: when you look at memory usage, it is constantly growing, but a lot of it is labelled as "cache" rather than "memory usage". If memory usage reaches a certain threshold, that memory is collected, but until then it is held in case it might be useful in the future.

We don't actively remove data from it before it grows full.
We might change it to a "weak cache" implementation in the future, but that would be a breaking change for environments where weak caches don't exist (which would be fine). I believe I deliberately decided against it back when I faced that choice, though I'll have to try hard to remember my reasoning from back then 😆
Either way, to track this and experiment with it, I've opened #12361 - which might or might not help here.

If you want to limit how big this cache can grow, I recommend you set a smaller cache size.
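
For example (a sketch of the configuration knobs, not a recommendation for any particular number; resultCacheMaxSize is the InMemoryCache option mentioned later in this thread, and cacheSizes is the memory-management API added in v3.9):

    import { InMemoryCache } from "@apollo/client";
    import { cacheSizes } from "@apollo/client/utilities";

    // Option A (v3.9+): adjust the shared cache-size defaults.
    // These need to be set before the cache/client is created.
    cacheSizes["inMemoryCache.executeSelectionSet"] = 10_000;    // example value
    cacheSizes["inMemoryCache.executeSubSelectedArray"] = 5_000; // example value

    // Option B: cap the memoized result cache directly on the InMemoryCache.
    const cache = new InMemoryCache({
      resultCacheMaxSize: 10_000, // example value, tune for your app
    });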

PS: I'm kinda curious - in what environment are you using this that memory size is of a concern to this level? Note that on servers you shouldn't have ApolloClient instances that exist for longer than one user request.

Thank you for the quick turnaround.
And I agree with the distinction between an unbounded memory leak vs. a max cache size.

PS: I'm kinda curious - in what environment are you using this that memory size is of a concern to this level

For context:
We have a Single Page App that is:

  • using apollo-client v3.8.
  • we use cache.evict() followed by cache.gc() to manually remove unneeded data from the Apollo cache (see the sketch below).
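
A minimal sketch of that pattern (staleUser is a placeholder for whatever entity object we want to drop):

    // Remove a specific entity from the normalized cache, then collect
    // anything that became unreachable as a result.
    client.cache.evict({ id: client.cache.identify(staleUser) });
    client.cache.gc();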

When we tried to upgrade the client to v3.9, we saw an increase in the browser's usedJSHeapSize, and other UI perf metrics worsened.
Hence I was looking into what changed between v3.8 and v3.9 in terms of memory.

Also, historically, we have been configuring apollo-client with resultCacheMaxSize: Infinity due to legacy reasons 😬:

  • to work around cache perf degradation when optimism was originally introduced, similar to #7544
  • lack of product limits on our side results in big customers having complex configurations (e.g. a high number of entities); in some cases, the result cache settled at around ~160k entries last time.

I will get back to the team with your suggestions and see how best to proceed in our case.

Hi @phryneas, my apologies for the late update to this thread.

After some more testing, here are some of my findings. I am putting them here in case they help others in the future.
Please correct me if anything below is wrong.

I would need to dig deeper into the code as well, to understand where the inMemoryCache.executeSelectionSet (optimism.wrap) cache was pruned

In my case, this executeSelectionSet cache was pruned as part of calling client.cache.gc() after client.cache.evict().
There is some intricate wiring here between entityStore and executeSelectionSet that I do not fully grasp 😅:


If you want to limit how big this cache can grow, I recommend you set a smaller cache size.

In our case, it is true that we can limit the cache size (we will be doing this); however, there are still other inefficiencies affecting our use case compared with v3.8 after calling cache.gc():

  • when re-running a previously used graphql query, new cache entries for executeSelectionSet are created. The pool size is used up quite fast, which might lead to extra time doing LRU pruning later. For example:

    useQuery(aBigQuery);
    console.log(client.cache.storeReader.executeSelectionSet.size); // 30_000
    client.cache.gc();
    useQuery(aBigQuery);
    console.log(client.cache.storeReader.executeSelectionSet.size); // 60_000

  • similarly, when re-running a previously used graphql query after cache.gc(), because the cache key has changed (see the first issue comment), extra CPU time needs to be spent re-executing the real execSelectionSetImpl. This has a noticeable impact on UI perf in our use case, especially for large accounts.

Workaround:
To work around this behaviour change in v3.9, I am planning to use a custom DocumentTransform for our apollo-client (a rough sketch follows after this list):

  • the custom DocumentTransform will do a simple pass-through of the documentNode, but with a custom getCacheKey strategy
  • instead of using the passed-in documentNode as the cache key (the default behaviour), it will use the extracted operation type and name instead (for example: ["query", "user"] instead of [document])
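
A rough sketch of that transform, assuming the operation name is unique enough in our app to act as a cache key (passthroughTransform is a made-up name; the transform itself is a no-op, only getCacheKey changes):

    import { ApolloClient, InMemoryCache, DocumentTransform } from "@apollo/client";
    import { getOperationDefinition } from "@apollo/client/utilities";

    // Pass-through transform with a custom cache key: operation type + name
    // instead of the DocumentNode object identity.
    const passthroughTransform = new DocumentTransform((document) => document, {
      getCacheKey: (document) => {
        const op = getOperationDefinition(document);
        // e.g. ["query", "user"]; returning undefined falls back to the default keying
        return op ? [op.operation, op.name?.value] : undefined;
      },
    });

    const client = new ApolloClient({
      cache: new InMemoryCache(),
      documentTransform: passthroughTransform,
      // uri / link configuration omitted
    });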

Some other notes:

  • This issue with cache key generation after calling cache.gc() (see the issue's first comment) only affects the React hook version useQuery(), not the normal client.readQuery() behaviour (see the contrast sketch after this list).
    • See this codesandbox test file for a simplified test scenario.
    • It looks like client.readQuery() does not depend on the QueryManager (which uses inMemoryCache.addTypenameToDocument).
      executeSelectionSet has its own "adding __typename to query" logic (source), hence the client.readQuery() result includes __typename fields by default.
  • Since calling addTypenameTransform.resetCache() inside inMemoryCache.gc() does somewhat affect the result cache (for the React hook version), would it be reasonable to only do so if the resetResultCache: true option was used?
    I am not sure here; the original context of the bd26676 PR is quite sparse.
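
A hedged illustration of that contrast (GET_LOCATIONS stands in for any query whose data is already in the cache):

    // client.readQuery() bypasses the QueryManager's document transform, so a
    // preceding cache.gc() does not change its cache key.
    client.cache.gc();
    const result = client.readQuery({ query: GET_LOCATIONS });
    // `result` still includes __typename fields by default.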

Hope this helps.

Hi @phryneas,
Hope you are well.

Based on the comment above:

  • calling addTypenameTransform.resetCache() in cache.gc() effectively invalidates the result cache used by the React hook versions (for example useQuery)
    • this makes subsequent React re-renders of previous useQuery() calls take longer, since storeReader cannot find their previous cache entries in storeReader.executeSelectionSet
  • this issue affects the React hook version, but not APIs like client.readQuery()

What do you think about only calling addTypenameTransform.resetCache() in inMemoryCache.gc() if the { resetResultCache: true } option was used?
This would make the expectations about the result cache consistent between the hook and API versions (a hedged summary of the two call forms follows below).
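
A hedged summary of the two call forms under that proposal (this describes the proposed behaviour, not what any released version currently does):

    // Under the proposal:
    client.cache.gc();
    //   would only collect unreachable entities, leaving the addTypename
    //   transform cache (and therefore useQuery()'s result-cache keys) intact.

    client.cache.gc({ resetResultCache: true });
    //   would keep today's behaviour: also reset/recreate the result caches,
    //   including the addTypename transform cache.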

Hey @tgohn! That's a good idea. I've made that change over in #12459; can you try the PR build and let us know if this helps?

npm i https://pkg.pr.new/@apollo/client@12459

Hi @jerelmiller,
Thank you for the quick reply.

I modified the reproduction codesandbox to use your version above:

And the new version fixed the issue 🙌.
When clicking the "Toggle Show Location & cache.GC()" button multiple times, the inMemoryCache.executeSelectionSet.size printout now stays the same.

@tgohn glad that helps! We've merged that PR but want to wait for a minor release to release it. This will be included in 3.14 when we get that out (probably some time in late April/early May as it will also contain all the deprecations and warnings to prepare for 4.0). If you absolutely need that change now before 3.14 or any of its prereleases are out, I'd recommend patch-package to apply that change to your codebase.

Since this has been merged, I'll go ahead and close this. Be on the lookout for 3.14 releases in the near future!
