Augmenting AST with parent pointers

Question

LeaVerou opened this issue 8 months ago

I pushed two functions for setting pointers to parent nodes:

parents.set() sets the parent on a specific node, to another provided node
parents.setAll() walks an AST setting parent references recursively.

Both are in src/parents.js.

Right now they set regular properties on the nodes themselves; the properties are just non-enumerable. This maximizes the DX of reading the parent pointers, so it's a good way forward if devs using vastly also want to read parents.
An alternative design would be to use a WeakMap to map nodes to their parents, and provide a parents.get() function to read it. This has the advantage that it does not mutate the AST nodes (and the DX is still not totally horrible).
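
To make the trade-off concrete, here is a minimal sketch of the two storage designs side by side. The property name `parent` and all function names here are assumptions for illustration, not the actual code in src/parents.js.

```javascript
// Design A: store the parent as a non-enumerable property on the node itself.
// (The property name "parent" is an assumption for this sketch.)
function setParentProperty(node, parent) {
  Object.defineProperty(node, "parent", {
    value: parent,
    enumerable: false, // hidden from Object.keys(), for...in and JSON.stringify()
    writable: true,
    configurable: true,
  });
}

// Design B: keep parents in a side table, so the node object is never mutated.
const parentTable = new WeakMap();

function setParentEntry(node, parent) {
  parentTable.set(node, parent);
}

function getParentEntry(node) {
  return parentTable.get(node);
}

const root = {
  type: "BinaryExpression",
  operator: "+",
  left: { type: "Identifier", name: "a" },
  right: { type: "Identifier", name: "b" },
};

setParentProperty(root.left, root); // read back as root.left.parent
setParentEntry(root.right, root);   // read back as getParentEntry(root.right)
```

With design A, reading is plain property access, and serializing the tree still works because the back-reference is not enumerable. With design B, `root.right` gains no properties at all, at the cost of going through a getter.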

By default they skip nodes that already have parent references (parents.setAll() skips the entire subtree under the node), so repeated use should be reasonably performant.
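
The skipping behaviour could look roughly like the sketch below. It assumes ESTree-style nodes (plain objects with a string `type`), finds children by scanning own enumerable properties, and uses a hypothetical `force` option to override the skip; none of these details are confirmed by the actual implementation.

```javascript
// Assumption: a node is any object with a string `type`.
function isNode(value) {
  return value !== null && typeof value === "object" && typeof value.type === "string";
}

// Assumption: children are own enumerable properties holding a node or an
// array of nodes. The non-enumerable parent link is never visited, so the
// walk cannot loop back up the tree.
function childrenOf(node) {
  const children = [];
  for (const value of Object.values(node)) {
    if (Array.isArray(value)) children.push(...value.filter(isNode));
    else if (isNode(value)) children.push(value);
  }
  return children;
}

function setAllParents(node, { force = false } = {}) {
  for (const child of childrenOf(node)) {
    // A child that already has a parent is skipped along with its whole subtree.
    if (!force && "parent" in child) continue;
    Object.defineProperty(child, "parent", {
      value: node,
      enumerable: false,
      writable: true,
      configurable: true,
    });
    setAllParents(child, { force });
  }
}

const ast = {
  type: "BinaryExpression",
  operator: "*",
  left: {
    type: "BinaryExpression",
    operator: "+",
    left: { type: "Identifier", name: "a" },
    right: { type: "Identifier", name: "b" },
  },
  right: { type: "Literal", value: 2 },
};

setAllParents(ast); // first call walks the whole tree
setAllParents(ast); // second call stops at the root's direct children
```

One consequence of skipping whole subtrees: if a subtree is edited after its root has been linked, the new nodes are not picked up unless the skip is overridden.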

Now the question is, when to use these functions internally.
Parent pointers are necessary in a number of vastly functions.

Some of these functions take an entire AST, so they can also call parents.setAll() on it (e.g. the function @adamjanicki2 is working on).
However, others require it to have been called beforehand (e.g. closest()) and cannot do much otherwise.
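
To illustrate why a function like closest() is stuck without that prep work, here is a hypothetical stand-in. The name `closestOfType`, its signature, and the choice to start at the node itself (as DOM `Element.closest()` does) are assumptions, not vastly's actual API.

```javascript
// Walks up the parent chain until a node of the given type is found.
// It only ever receives a single node, so it has no way to discover
// ancestors if the parent references were never set.
function closestOfType(node, type) {
  for (let current = node; current; current = current.parent) {
    if (current.type === type) return current;
  }
  return undefined;
}

const arg = { type: "Identifier", name: "x" };
const call = { type: "CallExpression", arguments: [arg] };

// No parent reference yet: the walk ends immediately.
const before = closestOfType(arg, "CallExpression"); // undefined

// Stand-in for what parents.set() would do (property name assumed).
Object.defineProperty(arg, "parent", {
  value: call,
  enumerable: false,
  writable: true,
  configurable: true,
});

const after = closestOfType(arg, "CallExpression"); // the `call` node
```

Note that the failure mode in the first call is silent: the result is indistinguishable from "no such ancestor exists".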

Factors to consider

Usability: Ideally we don't want our users to have to think about calling some separate function before they can use other functions, things should just work.
If a function that is supposed to do something else also adds references, this breaks the principle of least surprise, and could turn what seemed like a side-effect-free function into one with (opaque) side effects. Most use cases would likely be fine with that, but some require the objects to remain untouched. That can be mitigated if we use the alternative WeakMap design discussed above.
While the functions are designed to be reasonably performant, there is some overhead if every vastly function also sets parent nodes.
As a design principle, these functions should be separate. It's entirely possible authors will use vastly without ever needing to call a function that requires parent references.

Options

(Not all mutually exclusive)

Every function also calls parents.setAll() on any node they get, to maximize the odds that the reference is available when needed.
- Pros:
  - Things just work™
- Cons:
  - …except when they don't. 😄 As with many good heuristics, this would work well in most cases, but it would be confusing to understand what happened when it doesn't.
  - Breaks modularization: functions end up doing more than they really need to
We could have a parse() function where authors can set a default parser as a parameter, and then it parses strings using that, and also sets parent references. They could either use parse() and have an AST that just works with every vastly function automatically, or use whatever other thing and then they need to do more work.
- Pros:
  - in line with simple things being easy, and complex things being possible.
- Cons:
  - Lack of flexibility
Every function that requires a parent reference, also accepts the whole AST as a separate parameter, so that parent references can be set as needed.
- Pros:
  - Lean approach, nobody does more work than they need to
- Cons:
  - Providing the whole AST may not always be possible (but it doesn't have to be either/or: this can be an optional parameter, so it can be combined with other approaches)
We clearly document which functions require parent references to be set, and have users do it.
- Pros:
  - No surprises
  - Maximum modularity
- Cons:
  - Error-prone, easy to forget
  - Confusing to understand why some functions require this prep work
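
As a rough illustration of the third option (an optional AST parameter), the sketch below sets parent references lazily, only when the node has none and the caller supplied the tree. All names (`linkParents`, `closestLazy`, the `parent` property) are hypothetical, and the node shape is assumed to be ESTree-like.

```javascript
// Assumption: a node is any object with a string `type`.
function isAstNode(value) {
  return value !== null && typeof value === "object" && typeof value.type === "string";
}

// Minimal stand-in for parents.setAll(): links children to parents,
// skipping subtrees that are already linked.
function linkParents(node) {
  for (const value of Object.values(node)) {
    const candidates = Array.isArray(value) ? value : [value];
    for (const child of candidates) {
      if (!isAstNode(child) || "parent" in child) continue;
      Object.defineProperty(child, "parent", {
        value: node,
        enumerable: false,
        writable: true,
        configurable: true,
      });
      linkParents(child);
    }
  }
}

// `ast` is optional. It is only walked if the node has no parent yet.
function closestLazy(node, type, ast) {
  if (ast !== undefined && ast !== node && !("parent" in node)) {
    linkParents(ast);
  }
  for (let current = node; current; current = current.parent) {
    if (current.type === type) return current;
  }
  return undefined;
}

const program = {
  type: "Program",
  body: [
    {
      type: "ExpressionStatement",
      expression: {
        type: "CallExpression",
        callee: { type: "Identifier", name: "f" },
        arguments: [{ type: "Identifier", name: "x" }],
      },
    },
  ],
};
const argNode = program.body[0].expression.arguments[0];

const withoutAst = closestLazy(argNode, "ExpressionStatement");       // undefined
const withAst = closestLazy(argNode, "ExpressionStatement", program); // program.body[0]
```

Callers who have already set parent references pay nothing extra, and callers who cannot provide the tree fall back to the documented "set parents first" requirement.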