Idea: A tree-aware task scheduler

Question

Idea: A tree-aware task scheduler

justinfagnani opened this issue 3 months ago · comments

Idea

I think it may be very useful to have a way to schedule DOM update tasks in a way that the DOM will take care of executing them in tree-order.

Something like:

interface Element {
  queueDOMUpdate(callback: () => void): Promise<void>;
}

Why

Batched and ordered DOM updates are important for performance and often correctness. Currently this batching and ordering most often is either centralized by frameworks, or distributed but emergently coordinated in common web components libraries.

The emergent order of custom elements generally happens this way:

Initial render
- The browser schedules CE reactions for new elements from parsed HTML in tree order
- Instances enqueue an render microtask in their constructor or connectedCallback
- So the microtask queue is tree ordered
- The render tasks create more child elements, whose CE reactions are queued, which then enqueue more microtasks...
Updates
- Some state for an element is changed (attributeChangedCallback, a JS setter or method call)
- The element schedules an update microtask
- The microtask may update children, which enqueue their own update microtasks...

This pattern results in top-down rendering when we have top-down data flow, with the only coordination point being use of the global microtask queue resource.

It's robust to common DOM update patterns too, especially those using DOM events. A element may handle some user input and dispatch an event. The listener of the event may control some state that it changes in response. The new state flows down the tree, triggering updates in tree-order.

Tree-ordered rendering doesn't always emerge in situations where you have cross-tree shared observable state though. You may have a central data store (Redux, MobX, etc.) and if components aren't notified of changes in the order that they subscribe, or they manage to subscribe in out-of-tree-order (either because of upgrades, conditional subscriptions, or tree modifications) then children may be notified before parents.

This can cause a child to update twice if it depends both on the global store and passed-down state that the parent derives from the global store. It also can cause children to update that are removed by the parent due to the state change.

Currently, component/global-state integrations would have to take care to preserve tree-ordered notifications, and that ordering would only be maintained for that integration library - it likely wouldn't extend to other integrations, direct use of the global store, or other shared observable state.

And these types of cross-tree state updates may become much more common if signals are standardized in JavaScript.

Potential solution

What could help is if all components scheduled their update tasks with a central scheduler that guaranteed that tasks would run in tree order (specifics like what window the tasks have to be scheduled in, how this interacts with the microtask or task queues, purposefully left out).

Adding this tree-aware of scheduling to the DOM would achieve a few things:

Relieve responsibilities from frameworks and custom element libraries
Increase interoperability between frameworks when mixed in a single tree
Make it possible for independently built custom elements to get tree-ordered updates even with cross-tree state changes.

Additionally, a DOM scheduler seems like a good primitive to have for declarative custom elements and template instantiation. For example, the timing of when a declarative updates when it's state changes, can be defined by it calling queueDOMUpdate().

Steve Orvell · Answer 1 · Thu Apr 25 2024 09:14:06 GMT+0800 (China Standard Time)

It's probably also worth noting that the lack of a canonical answer for how to schedule these types of DOM based effects is a big reason that "effects" are left out of the current Signals proposal.

Also, it seems possible to construct, and may be wroth doing, a tree ordered scheduler using Signals as a proof of concept for this idea.

Justin Fagnani · Answer 2 · Fri Apr 26 2024 00:00:11 GMT+0800 (China Standard Time)

Yes, Signals integration is the thing that pushed me to file this issue.

Trying to solve the ordering problem on the Signals side is very difficult, because we would need to introduce an idea of a tree-structured ownership of watchers, and either a built-in scheduler that notifies the watchers in order, or allows them to schedule effects in order. Then you would need that to be mutable to reflect DOM updates, and maintain possibly some ownership or current watcher context across async boundaries with AsyncContext. It's quite a lot to add to Signals and I think it wouldn't get done.

But on the DOM side of things we already have a tree that always accurately reflects the DOM - the DOM tree itself. We wouldn't need to possibly implicitly build a tree based on async context like might be necessary with signals. And there are many non-Signals use cases that a scheduling API could help with, so a solution would be more general. Being both easier and more general suggests this is the right place for such a thing.

If we had this, then signals integration would be pretty straight forward.

Justin Fagnani · Answer 3 · Fri Apr 26 2024 01:05:07 GMT+0800 (China Standard Time)

I think talking about techniques like scheduling microtasks and how a queueDOMUpdate() method might integrate with Signals can be a bit abstract, so I made some code examples. These presume a particular API, but just for example's sake. I know it's too early to get this concrete in the solution space before people agree on the problem and use cases.

This is an example of an element that reads two numbers, x and y and displays them and their sum. We want to batch the updates so that there's only one if both x and y change.

Note: These examples does not fully show the problem with microtasks though: to do that we would need both shared signals and data passed down the tree to show the situation where the children update before the parent, then update again after the parent passes new data down.

These examples just demonstrate the techniques I'm referring to. I sketched up examples with no scheduler (microtasks), one with a hypothetical DOM scheduler, and one with hypothetical DOM scheduler and signals.

With Microtasks

class MyElement extends HTMLElement {
  #isUpdatePending = false;
  
  #x: string;
  set x(v) {
    this.#x = v;
    this.#requestUpdate();
  }
  get x() { return this.#x; }

  #y: string;
  set x(v) {
    this.#y = v;
    this.#requestUpdate();
  }
  get y() { return this.#y; }

  constructor() {
    super();
    this.attachShadow({mode: 'open'});
    this.shadowRoot.innerHTML = `
      <span id="x"></span> + <span id="y"></span> = <span id="sum"></span>;
    `;
  }
  
  async #requestUpdate() {
    if (this.#isUpdatePending === true) {
      return;
    }
    await 0;
    this.#isUpdatePending = false;
    this.shadowRoot.getElementById('x').textContent = this.x;
    this.shadowRoot.getElementById('y').textContent = this.y;
    this.shadowRoot.getElementById('sum').textContent = this.x + this.y;
  }
}

With queueDOMUpdate()

class MyElement extends HTMLElement {
  #x: string;
  set x(v) {
    this.#x = v;
    this.queueDomUpdate(this.#update);
  }
  get x() { return this.#x; }
  
  #y: string;
  set x(v) {
    this.#y = v;
    this.queueDomUpdate(this.#update);
  }
  get y() { return this.#y; }

  constructor() {
    super();
    this.attachShadow({mode: 'open'});
    this.shadowRoot.innerHTML = `
      <span id="x"></span> + <span id="y"></span> = <span id="sum"></span>;
    `;
  }
  
  #update() {
    this.shadowRoot.getElementById('x').textContent = this.x;
    this.shadowRoot.getElementById('y').textContent = this.y;
    this.shadowRoot.getElementById('sum').textContent = this.x + this.y;
  }
}

With Signals and queueDOMUpdate()

To keep this example simpler, I replaced the instance fields with shared Signals. A real element might be more complicated that this - accepting values as fields or signals, or allowing passing signals via fields.

This code is likely a bit buggy, since I'm just typing it out. I left out fine-grained DOM updates.

const x = new Signal.State(2);
const y = new Signal.State(3);
const sum = new Signal.Computed(() => x.get() + y.get());

class MyElement extends HTMLElement {
  constructor() {
    super();
    this.attachShadow({mode: 'open'});
    this.shadowRoot.innerHTML = `
      <span id="x"></span> + <span id="y"></span> = <span id="sum"></span>;
    `;
    
    // With Signals you want to separate reading your data from side-effects, so
    // this computed just reads all the data that is used. A template system might
    // do this separation for you, and/or a fine-grained approach might use a separate watcher
    // and computed per "binding", or use a proposed `notifiedBy` feature for watchers.
    const viewData = new Signal.Computed(() => ({
      x: x.get(),
      y: y.get(),
      sum: sum.get(),
    }));

    const watcher = new Signal.subtle.Watcher(() => {
      this.queueDOMUpdate(() => {
        // The actual side-effects
        const {x, y, sum} = viewData.get();
        this.shadowRoot.getElementById('x').textContent = x;
        this.shadowRoot.getElementById('y').textContent = y;
        this.shadowRoot.getElementById('sum').textContent = sum;
      });
    });
    watcher.watch(viewData);
    viewData.get();
  }  
}

Rob Eisenberg · Answer 4 · Fri Apr 26 2024 01:11:52 GMT+0800 (China Standard Time)

Do you think we could make this available on Comment and Text in addition to Element?

Steve Orvell · Answer 5 · Fri Apr 26 2024 02:42:07 GMT+0800 (China Standard Time)

I think scheduled tasks must be able to be flushed by default before paint. However, I wonder if it's worth trying to support tasks that should span frames, for example, rendering long lists.

Steve Orvell · Answer 6 · Fri Apr 26 2024 02:46:06 GMT+0800 (China Standard Time)

There's another potential need to schedule work after your subtree does something. Can/should this be considered as well?

Naively:

host.child.x = 5;
host.child.dependsOnProcessingX();

Justin Fagnani · Answer 7 · Fri Apr 26 2024 04:59:00 GMT+0800 (China Standard Time)

@sorvell yes, there are use cases for wanting to update after an element's parents and previous siblings have updated, and occasionally needing to run some task after an element's children have updated.

This second use case is quite difficult to address with current async/batching rendering techniques, so it would be pretty amazing if we could solve both.

Ryosuke Niwa · Answer 8 · Fri Apr 26 2024 05:29:08 GMT+0800 (China Standard Time)

So the idea here is to queue a task per element/node and then some batching mechanism to invoke all these tasks in the tree order (preorder DFS) or reverse tree order (postorder DFS)?

Justin Fagnani · Answer 9 · Fri Apr 26 2024 05:36:49 GMT+0800 (China Standard Time)

@rniwa exactly

Ryosuke Niwa · Answer 10 · Fri Apr 26 2024 05:43:11 GMT+0800 (China Standard Time)

@rniwa exactly

The timing of batching seems like a key question here then. It could be once per a microtask, a task, or an animation frame.

Justin Fagnani · Answer 11 · Fri Apr 26 2024 06:39:19 GMT+0800 (China Standard Time)

@rniwa exactly

The timing of batching seems like a key question here then.

Yeah, that's going to be the bulk of figuring out a proposal I think.

It could be once per a microtask, a task, or an animation frame.

Once per microtask is an intriguing option. This would make waiting for a subtree to render potentially not require another API, since you could just enqueue a microtask for that. Of course, you can already wait for a rAF, but that sounds like it has occasionally caused problems for some people.

Another question is whether or not the update queue could be flushed synchronously. We've had developers want to use .offsetHeight and friends, but they don't work correctly with pending async updates. ResizeObserver might have obsoleted these cases. Still, it would be cool if we could get bother batching and make synchronous measurement APIs flush (or maybe throw if there are pending tasks in the subtree!).

Yet another question is whether we need priorities, and if so should they be compatible with the Prioritized Task Scheduling proposal: https://wicg.github.io/scheduling-apis/ In my past experience it's been easy to do lower priority scheduling in userland, but it might be good to integrate lower-priority tasks if we have some kind of flushing or pending state query feature.

Justin Fagnani · Answer 12 · Sat Apr 27 2024 02:36:50 GMT+0800 (China Standard Time)

fwiw, out of some historical curiosity and some different ideas in the area from the past: I once floated an idea to some Chrome colleagues to add distinct async read/write phases to the DOM. This was based on the Fast DOM approach and Dimitri Glazkov's nope.js (which threw on sync layout flushing APIs): you wait for the whole DOM to enter a read or write phase, then perform whatever operations you need for that phase.

I've lost access to the internal doc where I wrote it up, but this is a extremely brief description and prototype of the idea: https://github.com/PolymerLabs/async-dom

At the time this was a slightly popular concept as a way to fix DOM performance footguns.

Looking back, I think the scheduler idea is much better, because it's much more explicit.

Instead of waiting for the DOM to get in some state, you're telling the DOM that you want to do this task when the DOM is ready for it. That's what allows the DOM to do tree-order scheduling and potentially sooner-than-microtask timing, flushing the queue, introspection, etc. Also, queueDOMUpdate() doesn't necessarily need to return a Promise, which would eliminate the allocations.

smaug---- · Answer 13 · Sat Apr 27 2024 18:58:36 GMT+0800 (China Standard Time)

What does "run in tree order" mean if a callback mutates the DOM structure? Would the implementation need to check tree order always right before calling a callback?

Adding more native -> JS callbacks does itself add overhead, so need to be careful that the approach actually improves performance.

Joe Pea · Answer 14 · Sun Apr 28 2024 13:36:02 GMT+0800 (China Standard Time)

It's probably also worth noting that the lack of a canonical answer for how to schedule these types of DOM based effects is a big reason that "effects" are left out of the current Signals proposal.

Signals have nothing to do with DOM. There's no reason a microtask effect couldn't ship by default (and I'd use it for DOM! And 3D rendering! And it would be absolutely great!), and there's plenty of opportunity to add alternative types of effects later (if needed, because people might adjust their programs to be implemented differently if they're given a narrow space to operate in at first, and it might actually be beneficial, or at least informing, as to what types of new primitives or effects need to be added to spec later).

Tito · Answer 15 · Mon Apr 29 2024 03:50:12 GMT+0800 (China Standard Time)

This is not needed, the browser already has this as appendChild/removeChild/insertBefore.

It is also not solving anything, because after the first run every tree is disconnected from each other and nobody knows which random thing we will do. I will explain.

Lets say we have the hypothetical situation of

<div id="1"></div>

function Component(){
 return <div id="2"></div>
}

<div id="3"></div>

seems to me you are proposing to solve this by something similar to

<div id="1"></div>

queueDOMUpdate(function Component(){
 return <div id="2"></div>
})

<div id="3"></div>

That will work only once. Lets imagine I have this other situation

<div id="1"></div>

queueDOMUpdate(function Component(shouldShowSignal){
 return shouldShowSignal() ? null : <div id="2"></div>
})

<div id="3"></div>

Lets say shouldShowSignal is false, so queueDOMUpdate prints nothing to the DOM. Now, if the signal changes to true, what does queueDOMUpdate does? Nothing because already ran long ago, and if you run queueDOMUpdate inside the component it will append the element at the end of the tree which is not where we want to append it . So how do we solve this problem? With a placeholder.

<div id="1"></div>

<placeholder id="2"></div>

function Component(){
 return insertBefore(<div id="2.1"></div>, placeholder)
)

<div id="3"></div>

If the signal toggles to other values, you can imagine the insertBefore works as expected and queueDOMUpdate has nothing to do. You cannot rely on a microtask to insert the element at the right place, you have to use placeholders. The scheduling problem is imaginary.

Joe Pea · Answer 16 · Mon Apr 29 2024 04:40:19 GMT+0800 (China Standard Time)

I think that might be a different problem @titoBouzout, basically relating to the creation order of component trees. Your example might not consider what happens with updates over time, and I think that what @justinfagnani is alluding to is that something like this psuedo code,

<div id="1"></div>

queueDOMUpdate(function Component(shouldShowSignal){
 return shouldShowSignal() ? null : <div id="2"></div>
})

<div id="3"></div>

would actually be more like the following where an update mechanism (f.e. an effect, or update method in Lit, or function component re-run in React, etc) would be scheduled in a way such that the ordering of reactions (effects, update calls, component re-runs, etc) happen in tree order:

<div id="1"></div>

<placeholder id="placeholder" />

function Component(shouldShowSignal) {
  createDOMUpdateEffect(() => { // using Solid.js terminology here
    placeholder.innerHTML = "" // clear
    placeholder.append(shouldShowSignal() ? null : <div id="2"></div>)
  })
})

<div id="3"></div>

The main idea here is that, something like createDOMUpdateEffect could be basically an effect just like in Solid, but scheduled to the next microtask and executed in tree-order relative to any other such effects (as opposed to simply added to a queue with queueMicrotask where the effects would be called essentially in the order in which state changed).

And the reason Justin wants this is so that if

child component/element property is updated
parent component/element property is updated

then the effects will run within the next microtask in the guaranteed order of

the createDOMUpdateEffect in the parent
the createDOMUpdateEffect in the child

instead of an ordering like

the createDOMUpdateEffect in the child
the createDOMUpdateEffect in the parent

Basically the tree order would be determining the order of queued reactions (imagine the same thing but scheduling update() calls in Lit instead of effects Solid or Lume Element, or scheduling component re-runs in frameworks like React or Preact, etc).

The result would be that, no matter what, if both parent and child were modified, and for example the parent will remove the child due to its state change, the effect for the child could be cleaned up and never run (whereas if it ran first, it would be wasteful).

While I understand this desire, I feel like this is an edge case that is much too specific to DOM and certain ways of implementing custom elements (or components), and is completely unnecessary to block shipping effects for sake of handling one specific way of writing DOM code.

Microtask effects will not be in tree order, and they will serve many people well.

In my custom elements, I can handle this use case easily, even with synchronous-non-microtask effects:

connectedCallback() {
  super.connectedCallback()
  this.createEffect(() => {
    someValue()
    const raf = requestAnimationFrame(() => {
      // ...heavy update with someValue...
    })
    onCleanup(() => cancelAnimationFrame(raf)) // cleanup runs on disconnect, not just before any re-run triggered by someValue()
  })
}

where this.createEffect is a wrapper around Solid.js createEffect with the added convenience that it cleans up on disconnectedCallback automatically. If I have expensive rendering logic that should be batched, I can easily do that by batching it into a requestAnimationFrame, even if the effect is synchronous which in Solid.js 1.0 it historically has been (though microtask effects would be much more ideal).

With this pattern, I can easily, for example, immediately remove children, and the next requestAnimationFrame won't have expensive logic for the removed child (even if the child's data changed before the parent).

All I'm saying is, I don't think we need this complexity to ship some sort of basic effect that will be super useful to a lot of people. This specific type of scheduling desire is valid, but it is very specific, not generic.

Furthermore, something like queueDOMMicrotask can easily be combined with a basic microtask-effect to achieve the same purpose.

In the next example, I'll use a hypothetical API named createMicrotaskEffect to denote that it is a simple effect API that simply queues a single reaction in the next microtask, but it could just as easily be replaced with a createSynchronousEffect that creates an effect that runs immediate on signal change:

let queued = false

createMicrotaskEffect(() => {
  // dependencies
  signalOne()
  signalTwo()

  if (queued) return
  queued = true
  this.queueDOMUpdate(() => { // `this` is a custom element, for example
    queued = false

    // run tree-order logic here (runs in the next microtask, for example)
  })
})

And just like that, by pairing both APIs together, we've solved the problem, without requiring that effects have special types of scheduling. Replace createMicrotaskEffect with createSynchronousEffect and it works just the same.

What would be useful is to make the Effect API flexible enough to easily be able to create new Effect APIs like createDOMUpdateEffect out of simpler effects like createMicrotaskEffect.

Basically, we don't need to block effects on anything scheduling requirement as specific as in the thread. And yeah, maybe queueDOMUpdate could be useful! (I've been fine writing WebGL apps (where modifying reactive properties can recreate a whole geometry and shader to expensively upload to the GPU again) using declarative templating within custom elements, and I've been fine without this type of scheduling, although I can admit it might be convenient in a few cases, but I just wouldn't want to stop us from shipping a basic effect that's entirely useful).

Tito · Answer 17 · Mon Apr 29 2024 05:25:12 GMT+0800 (China Standard Time)

Reactivity with signals doesn't suffer this problem, and as I illustrated, you cannot fix this by pushing it to the future. This problem manifest if you do stuff manually and/or try to rely on the order of things, people think too much on microtask stuff and that's a big mistake. The order is dictated by the resolution of the reactive graph not by the user. The only warranted order in reactivity (as it should be for performance reasons) is that cleanups run before re-executing. Stuff runs when the graph settles, if for the time it does so, your tree will update the wrong stuff, then what needs fixing is how you are doing things, not the browser. So there's no need to disambiguate something that is already disambiguate. with Signals you just do not rely on the order of things.

Steve Orvell · Answer 18 · Mon Apr 29 2024 21:37:58 GMT+0800 (China Standard Time)

This proposal is not related to signals. It's for scheduling tasks that can effect a DOM tree.

The justification is pretty straightforward: you want to do some arbitrary work on a parent and a child. If you do the work on the child first, the work on the parent might invalidate the work you do on the child. Therefore, it's better to schedule the work such that the parent work comes before the child work. This provides an opportunity to efficiently customize the child work based on its updated state. This is a common need in applications and frameworks. Putting the scheduler in the platform means this work can more easily be scheduled in an interoperable way.

Then why are signals mentioned at all here? They are simply one of many ways it's possible to have this need to accomplish a set of work spread across a series of nodes in the DOM tree.

Tito · Answer 19 · Mon Apr 29 2024 23:27:40 GMT+0800 (China Standard Time)

I understand now, I missed this is interfacing Element. It sort of makes sense, but you have to have a use case for it.

Steve Orvell · Answer 20 · Thu May 02 2024 23:55:21 GMT+0800 (China Standard Time)

Here is an example that attempts to show why a DOM based task scheduler is helpful.

The example includes a simple prototype of the proposed queueDOMUpdate API.

The use case is somewhat contrived but this type of issue can come up in a variety of ways in more complicated applications that manage a lot of shared state.

Daniel Ehrenberg · Answer 21 · Sat May 04 2024 08:41:52 GMT+0800 (China Standard Time)

I agree that the DOM will eventually be the right level to put this kind of scheduling, but I don’t understand how this primitive solves things. +1 to the relevance of @rniwa ’s and @smaug---- ’s questions, which I don’t know how to answer (there are multiple possible answers).

My understanding was that frameworks need things to run outside to inside because the inside is control-dependent on outer structures like conditionals, loops and suspense, which can insert, delete or replace elements. It’s not clear to me how control structures fit into this model (there are ways to do it, I just don’t know what you are proposing).