RFC: Consolidating document and normalised caching - safer caching

JoviDeCroock opened this issue 7 months ago · comments

Summary

By default urql ships with a document-based cache, and as your application grows you may find you need normalised caching. Migrating from one to the other currently isn't a great process, and both caches have their own failure points. Going over them quickly:

In document-based caching we fail to keep the cache fresh when:

A mutation returns a scalar
A query returns null or an empty array (we fail to derive the __typename that will be invalidated by mutations)
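To make the second failure case concrete, here is a minimal sketch of typename-based tagging. The `collectTypenames` helper is hypothetical (it is not urql's actual implementation); it only illustrates why a null or empty result gives mutations nothing to invalidate against.

```typescript
// Hypothetical helper: walk a query result and collect every __typename found in it.
function collectTypenames(data: unknown, found: Set<string> = new Set<string>()): Set<string> {
  if (Array.isArray(data)) {
    data.forEach(item => collectTypenames(item, found));
  } else if (data !== null && typeof data === 'object') {
    const record = data as Record<string, unknown>;
    if (typeof record.__typename === 'string') found.add(record.__typename);
    Object.keys(record).forEach(key => collectTypenames(record[key], found));
  }
  return found;
}

// A populated list can be tagged with "Todo", so a Todo mutation can invalidate it.
const populated = { todos: [{ __typename: 'Todo', id: '1' }] };
// An empty list (or null) carries no __typename, so the result is never tagged.
const empty = { todos: [] };

console.log(Array.from(collectTypenames(populated)));
console.log(Array.from(collectTypenames(empty)));
```

The first call yields the `Todo` typename, the second yields nothing, which is the gap described above.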

In normalised caching we fail to keep the cache fresh when:

A mutation returns a scalar
A mutation returns an entity that is not in the cache

Normalised caching generally reduces the load on your infrastructure, because we can update a mutated entity in memory when the mutation is, for instance, an update. This comes at a price, however. When we create or delete an entity we need to write a custom updater function, as the cache doesn't know what to do with it (insert it into or remove it from a list, link to it from Query.entity(id: x)). Automatic updates can also leave a list's ordering off: for example, the name got updated but we forgot to invalidate the list containing this entity...
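For readers who haven't written one, this is roughly what such a custom updater looks like today. It is a sketch only: `CacheLike` and `FieldInfo` are local stand-ins that mirror the parts of the graphcache cache used here (assumed shapes, not imported from the library), and `createTodo` and `todos` are hypothetical field names.

```typescript
// Local stand-ins for the parts of graphcache's cache used below (assumed shapes).
interface FieldInfo {
  fieldKey: string;
  fieldName: string;
  arguments: Record<string, unknown> | null;
}

interface CacheLike {
  inspectFields(entity: string): FieldInfo[];
  invalidate(entity: string, fieldName?: string, args?: Record<string, unknown> | null): void;
}

// The object that would be passed as cacheExchange({ updates }).
const updates = {
  Mutation: {
    // After creating a Todo, invalidate every cached variant of Query.todos
    // so the lists are refetched instead of going stale.
    createTodo(_result: unknown, _args: unknown, cache: CacheLike): void {
      cache
        .inspectFields('Query')
        .filter(field => field.fieldName === 'todos')
        .forEach(field => cache.invalidate('Query', field.fieldName, field.arguments));
    },
  },
};
```

Every mutation that creates or deletes an entity needs a variation of this, which is the learning curve this RFC tries to defer.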

We want to avoid folks having to face the learning curve of updaters and the like right away, and instead let them adopt these gradually and improve how the application works when they see the need arise.

Another benefit of consolidating these caches is that we can make the normalised cache the default and hence enable a simplified implementation of useFragment. Between the LSP and gql.tada, I think we are quite ready to start supporting fragments as a first-class citizen. We might need to improve our heuristic matching or come up with a new primitive like Apollo has, but apart from that I think we have a well-rounded approach.

Proposed Solution

I propose that we reduce the need for custom updaters by adding a default updater, which won't run when the user has defined their own updater.

When a scalar is returned, e.g. for a delete, we throw away the cache
When an entity is returned that isn't present in the cache, we remove the lists where its type is present
Optional: when an entity is updated automatically, provide the option to mark lists as stale

In doing so we make normalised caching safer, let users opt in to optimisations by writing custom updaters, and remove the potential pain point of having to migrate caches later in the journey.
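The rules of the proposed default updater can be sketched as a small decision function. Everything here is hypothetical: the `DefaultAction` names, the `decideDefaultAction` function and the `Type:id` key format are illustrations of the proposal, not an existing graphcache API. As a simplification, any result that is not a single keyable entity is treated like the scalar case.

```typescript
type DefaultAction =
  | { kind: 'custom-updater' }                      // user-defined updater wins
  | { kind: 'invalidate-all' }                      // scalar result: throw away the cache
  | { kind: 'invalidate-lists'; typename: string }  // unknown entity: drop lists of its type
  | { kind: 'merge-entity'; key: string; markListsStale: boolean };

interface Entity {
  __typename: string;
  id: string;
}

function isEntity(value: unknown): value is Entity {
  if (value === null || typeof value !== 'object') return false;
  const record = value as Record<string, unknown>;
  return typeof record.__typename === 'string' && typeof record.id === 'string';
}

interface DecisionOptions {
  hasCustomUpdater: boolean;   // did the user register an updater for this mutation?
  knownKeys: Set<string>;      // entity keys currently in the cache
  markListsStale?: boolean;    // the optional third rule
}

function decideDefaultAction(result: unknown, opts: DecisionOptions): DefaultAction {
  if (opts.hasCustomUpdater) return { kind: 'custom-updater' };
  if (!isEntity(result)) return { kind: 'invalidate-all' };
  const key = result.__typename + ':' + result.id;
  if (!opts.knownKeys.has(key)) return { kind: 'invalidate-lists', typename: result.__typename };
  return { kind: 'merge-entity', key, markListsStale: opts.markListsStale === true };
}

const known = new Set<string>(['Todo:1']);
console.log(decideDefaultAction(true, { hasCustomUpdater: false, knownKeys: known }));
console.log(decideDefaultAction({ __typename: 'Todo', id: '2' }, { hasCustomUpdater: false, knownKeys: known }));
```

The first call falls into the scalar rule, the second into the unknown-entity rule; only an entity already in the cache is merged in place.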

Do note that the second limitation of the document-based cache still applies here: when a field returns either null or an empty list, we won't be able to invalidate it. We could respect additionalTypenames here to tag the field, but I'm not sure that's a smart thing to do...

Requirements

The normalised cache will most likely need some kind of tagging system to look up entities by __typename, as well as fields returning a list of a certain type. With that in place we could offer a wider set of helpers, e.g. to let folks invalidate all entities of a certain type. This ties back to #2713

An example of such a system in graph-cache terms:

{
  "Type:Todo": ["Todo:1", "Todo:2"],
  "List:Todo": ["Query.todos(take:10)"]
}

Now we're able to do things like invalidateList('Todo') or invalidateType('Todo') internally as a safety mechanism. An additional benefit is that folks who repeatedly write cache.inspectFields to e.g. invalidate all lists when they create an entity will be freed from this burden, as we can introduce getLists('Todo') 😅
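A rough sketch of such a tag index, assuming the `Type:` and `List:` key prefixes from the example above. The `TagIndex` class and its method names are hypothetical; only `getLists` is named in this RFC, and its exact signature is a guess.

```typescript
// Hypothetical in-memory tag index: typename -> entity keys, and typename -> list field keys.
class TagIndex {
  private entitiesByType = new Map<string, Set<string>>();
  private listsByType = new Map<string, Set<string>>();

  private static add(map: Map<string, Set<string>>, typename: string, key: string): void {
    let keys = map.get(typename);
    if (!keys) {
      keys = new Set<string>();
      map.set(typename, keys);
    }
    keys.add(key);
  }

  // Record that an entity of `typename` lives under `entityKey`, e.g. "Todo:1".
  tagEntity(typename: string, entityKey: string): void {
    TagIndex.add(this.entitiesByType, typename, entityKey);
  }

  // Record that `fieldKey`, e.g. "Query.todos(take:10)", returns a list of `typename`.
  tagList(typename: string, fieldKey: string): void {
    TagIndex.add(this.listsByType, typename, fieldKey);
  }

  getEntities(typename: string): string[] {
    return Array.from(this.entitiesByType.get(typename) || []);
  }

  getLists(typename: string): string[] {
    return Array.from(this.listsByType.get(typename) || []);
  }

  // Serialise to the "Type:X" / "List:X" shape shown above.
  toJSON(): Record<string, string[]> {
    const out: Record<string, string[]> = {};
    this.entitiesByType.forEach((keys, typename) => { out['Type:' + typename] = Array.from(keys); });
    this.listsByType.forEach((keys, typename) => { out['List:' + typename] = Array.from(keys); });
    return out;
  }
}

const index = new TagIndex();
index.tagEntity('Todo', 'Todo:1');
index.tagEntity('Todo', 'Todo:2');
index.tagList('Todo', 'Query.todos(take:10)');

console.log(JSON.stringify(index.toJSON(), null, 2));
console.log(index.getLists('Todo'));
```

An internal invalidateList('Todo') would then loop over `getLists('Todo')` and invalidate each field key, which replaces the hand-written cache.inspectFields filtering.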

Alternatives

An alternative to all of this could be a directive-based approach where you declaratively co-locate the logic for a mutation with your executable document, e.g.

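# $id must be declared, and GraphQL directive arguments must be named;
# the argument names below are illustrative only.
mutation ($id: ID!) {
  deleteTodo(id: $id) @invalidate(typename: "Todo", id: $id)
}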

A benefit of the directive approach is that you can write more general logic, which in turn reduces the initial overhead of loading your entry point, as less logic has to be defined in cacheExchange({}).updates. I know that folks can already build these abstractions with plain functions, but imho this way is easier to reason about: you look at your executable document in isolation and decide which directive is needed to perform a certain action.