composablesys / collabs

Collabs library monorepo

Home Page: https://collabs.readthedocs.io/

Initial values

mweidner037 opened this issue · comments

We should permit setting the initial values of Crdts in the constructor. Otherwise, the only option is for the first creator of a Crdt to perform operations that drive it to the initial value, but that gets confusing if multiple replicas do so concurrently (Yjs issue: https://discuss.yjs.dev/t/initial-offline-value-of-a-shared-document/465).

I see two approaches:

  • (easy but dangerous) Provide a method (e.g. in Runtime) that takes a callback and runs the operations in the callback to drive some newly-constructed Crdt to its desired initial value. Each replica would call this immediately after constructing. It would perform the operations using some dummy replicaId without actually sending them.
  • (effort but safer) In each of our types, add constructor params that let you set the initial value (e.g. see CCounter). This requires modifying each of our types, and likewise for users defining new types, but it avoids magic.
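
To make the first approach concrete, here is a minimal sketch of a callback-based helper. It is a toy model, not Collabs code: `ToyRuntime`, `perform`, `runInitial`, and the `"INIT"` sender id are all hypothetical names chosen for illustration.

```typescript
// Hypothetical toy runtime; none of these names come from Collabs.
type Op = { sender: string; seq: number; text: string };

class ToyRuntime {
  readonly log: Op[] = []; // ops applied locally
  readonly sent: Op[] = []; // ops that would be broadcast to other replicas
  private seqs = new Map<string, number>();
  private initializing = false;

  constructor(readonly replicaId: string) {}

  // Apply an op locally and, outside of initialization, queue it for sending.
  perform(text: string): void {
    const sender = this.initializing ? "INIT" : this.replicaId;
    const seq = this.seqs.get(sender) ?? 0;
    this.seqs.set(sender, seq + 1);
    const op: Op = { sender, seq, text };
    this.log.push(op);
    if (!this.initializing) this.sent.push(op);
  }

  // Run the callback's ops under the dummy "INIT" id without sending them.
  runInitial(callback: () => void): void {
    this.initializing = true;
    try {
      callback();
    } finally {
      this.initializing = false;
    }
  }
}

// Every replica runs the same callback right after construction.
const setup = (rt: ToyRuntime) => () => rt.perform("set title = Untitled");
const rtAlice = new ToyRuntime("alice");
const rtBob = new ToyRuntime("bob");
rtAlice.runInitial(setup(rtAlice));
rtBob.runInitial(setup(rtBob));
```

Both replicas end up with identical logs and nothing queued for sending. The danger is visible in the sketch too: nothing checks that every replica passes the same deterministic callback, so a mismatch would make replicas diverge silently.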

Issues (with either approach):

  • Loading/saving: it's tempting to save the state as if the initial values came from normal operations. But that may then cause confusion when you load, since the initial values will already be present, but might be duplicated (or deliberately removed, by later operations) in the saveData.
  • canGc must take into account the initial values: canGc() can be true only if the Crdt is in its initial state including the initial values, so that if it is deleted and reconstructed, it will end up the same. E.g. see CCounter. This might get tricky for collections, and may lead to us storing the initial values forever just so we can check canGc().
  • Resettable: similar issues to canGc(). reset() must put the Crdt in its canGc() state, i.e., including the initial values.
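
As a rough illustration of the canGc() and reset() constraints, here is a toy counter with a constructor-supplied initial value. It is only a sketch under simplifying assumptions: `ToyCounter` is a hypothetical class, not the Collabs CCounter, and its reset() is purely local and ignores concurrent operations.

```typescript
// Hypothetical toy counter; not the Collabs CCounter implementation.
class ToyCounter {
  // Net contribution of each replica on top of the initial value.
  private deltas = new Map<string, number>();

  constructor(readonly initialValue: number = 0) {}

  add(replicaId: string, amount: number): void {
    this.deltas.set(replicaId, (this.deltas.get(replicaId) ?? 0) + amount);
  }

  get value(): number {
    let sum = this.initialValue;
    for (const delta of Array.from(this.deltas.values())) sum += delta;
    return sum;
  }

  // True only in the freshly-constructed state, initial value included.
  canGc(): boolean {
    return this.deltas.size === 0;
  }

  // Local-only reset: return to the canGc() state, keeping the initial value.
  reset(): void {
    this.deltas.clear();
  }
}

const counter = new ToyCounter(10);
const gcBefore = counter.canGc();
counter.add("alice", 5);
const gcAfterAdd = counter.canGc();
counter.reset();
// Reconstructing with the same constructor args gives the same state.
const rebuilt = new ToyCounter(10);
```

After reset(), the counter matches a newly constructed `ToyCounter(10)`, which is the property that makes deleting and reconstructing it safe. For a counter this is cheap; a collection would need to keep its initial values around to make the same comparison.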

As part of this, we should refactor DeletingMutCSet so that its initial values are set using their arguments + valueConstructor, not the values themselves.

I'm currently in favor of the load-hack approach that has been mentioned for Yjs and Automerge: create a replica with id "INIT" that performs the desired initial ops, save its state, then load that on all replicas (if there is no other state to load; edit: it is okay to do this regardless, as long as you do it before loading other states, thanks to merging). This is easy and not too abstraction-busting. It's already possible (example), but we need to document it and could add helper functions.
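
The load-hack can be sketched with a toy state-based document. This is an illustration only, not the Collabs API: `ToyDoc`, `makeInitialState`, and the id scheme are hypothetical, and the toy supports only adds so that merging is a simple union.

```typescript
// Hypothetical toy CRDT: a grow-only collection of entries keyed by unique ids.
class ToyDoc {
  private entries = new Map<string, string>();
  private counter = 0;

  constructor(readonly replicaId: string) {}

  // Each add gets an id that is unique to (replicaId, counter).
  add(text: string): void {
    this.entries.set(`${this.replicaId}:${this.counter++}`, text);
  }

  save(): string {
    return JSON.stringify(Array.from(this.entries));
  }

  // Loading merges by union of ids, so loading the same state twice is a no-op.
  load(saved: string): void {
    for (const [id, text] of JSON.parse(saved) as [string, string][]) {
      this.entries.set(id, text);
    }
  }

  values(): string[] {
    return Array.from(this.entries.keys())
      .sort()
      .map((id) => this.entries.get(id)!);
  }
}

// Load-hack: a throwaway "INIT" replica performs the initial ops and is saved.
function makeInitialState(): string {
  const init = new ToyDoc("INIT");
  init.add("Welcome");
  return init.save();
}

const initialState = makeInitialState();
const alice = new ToyDoc("alice");
const bob = new ToyDoc("bob");
alice.load(initialState); // load before any other saved state
bob.load(initialState);
alice.add("from alice");
bob.load(alice.save());
alice.load(bob.save());

// Contrast: each replica performing the initial op itself duplicates the entry.
const carol = new ToyDoc("carol");
const dave = new ToyDoc("dave");
carol.add("Welcome");
dave.add("Welcome");
carol.load(dave.save());
```

Because every replica loads the same "INIT" ops, merging keeps a single "Welcome" entry on `alice` and `bob`, while `carol` ends up with two. In this toy the initial state is deterministic, so each replica could also compute it locally instead of receiving it.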

Previously, I had tried adding initialValue args to Collab constructors, but this has downsides:

  • It requires making a second parallel API to describe operations (the initialValue args).
  • It is difficult to implement for some Collabs, like CSet.

Note that this would not work for setting initial values in dynamically-created Collabs (e.g., CSet elements). However, in that case the creating replica (e.g., the CSet.add caller) can just perform initial ops to create that initial value. That still doesn't work for CLazyMap values (which do not have an explicit creator), but I do not yet see a use case for CLazyMap values with an initial state.

#247 adds "recipes" for how to do initial values (and how not to do it). I don't think any API support is needed, since even the most complicated option (load-hack) is only a few lines of code.