Side effects / Workers / LifecycleWorker methods not called if rendered/not rendered without ever yielding dispatch

Question

Side effects / Workers / LifecycleWorker methods not called if rendered/not rendered without ever yielding dispatch

steve-the-edwards opened this issue 9 months ago · comments

since worker is built on runningSideEffect, the easier way to reproduce this is:

 runningSideEffect("test") {
   try {
   } finally {
     error()
   }
}

If this is rendered and not rendered within the same thread frame (e.g. using the Main.Immediate dispatcher and cycling back and forth), error() will never get called.

At the very least this needs to be updated in the documentation.

This is because the Job's for side effects are started using CoroutineStart.LAZY so that we don't end up sending anything to the actionSink until after the render() method is complete and the RenderContext frozen (then we call start() on all the side effect jobs - see here.

    baseRenderContext.unfreeze()
    val rendering = interceptor.intercept(workflow, this)
      .render(props, state, context)
    baseRenderContext.freeze()

    // Tear down workflows and workers that are obsolete.
    subtreeManager.commitRenderedChildren()
    // Side effect jobs are launched lazily, since they can send actions to the sink, and can only
    // be started after context is frozen.
    sideEffects.forEachStaging { it.job.start() }
    sideEffects.commitStaging { it.job.cancel() }

What would be better if we provided a hook to the side effect's Job's invokeOnCompletion which will be invoked even if cancelled before the coroutine gets dispatched. E.g. see this test:

fun main() = runBlocking {

    val job = launch(start = CoroutineStart.LAZY) {
        try {
            repeat(1000) { i ->
                println("job: I'm sleeping $i ...")
                delay(500L)
            }
        } finally {
            println("I'm still run in the finally block.")
        }
    }
    job.invokeOnCompletion { cause ->
        println("I'm complete! $cause")
    }
    job.start()
    //yield()
    job.cancel()
    println("done test.")
}

Which prints "I'm complete" but not what is in the finally block (without the yield()).

I propose this could be called.

RenderContext.onNoLongerRendered(). It has to be outside of runningSideEffect's lambda because that won't ever get invoked. We want to attach this right on the Job we create for the side effect.

Stephen Edwards · Answer 1 · Thu Aug 03 2023 03:17:31 GMT+0800 (China Standard Time)

Am I overthinking and we should provide a Workflow node related lifecycle hook for this and leave the job out of it? not sure.

Ray Ryan · Answer 2 · Thu Aug 03 2023 03:22:09 GMT+0800 (China Standard Time)

In the test, w/o yield, do any of the I'm sleeping lines print?

Ray Ryan · Answer 3 · Thu Aug 03 2023 04:41:27 GMT+0800 (China Standard Time)

Seems like we have a bigger problem than this ticket implies: runningSideEffect {} does not fire at all in this scenario. Rather than providing a new teardown hook seems like we should see if we can guarantee that it will run.

Ray Ryan · Answer 4 · Thu Aug 10 2023 00:17:55 GMT+0800 (China Standard Time)

For the record: we just chatted, and we think that using ATOMIC instead of LAZY when launching side effects is the simpler and safer fix than finding just the right spot to sprinkle yield() dust. ATOMIC seems designed for exactly this use case, ensuring that something cannot be canceled before it is started.

https://kotlinlang.org/api/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines/-coroutine-start/-a-t-o-m-i-c

We believe that LAZY was used because originally Workers had their own dispatcher, and we needed to ensure they didn't get ahead of things. That stopped being the case at some point, probably when they were reimplemented via runningSideEffect, so this should be a safe switch.

Stephen Edwards · Answer 5 · Fri Aug 11 2023 23:10:18 GMT+0800 (China Standard Time)

I'm no longer thinking this is the case. When using ATOMIC we have trouble with any dispatcher using an EventLoop (such as Unconfined or Main.immediate, as when nested continuations are used on these dispatchers (as with Workflow!) we lose guarantees over the ordering of dispatch. (see https://kotlinlang.org/api/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines/-dispatchers/-unconfined.html for more info). Also see https://kotlinlang.org/api/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines/-main-coroutine-dispatcher/immediate.html.

So when we use ATOMIC we can get side effects using the RenderContext.actionSink before the render() is complete and the context is frozen.

To get around that we would have to use a lock at the start of every side effect until render is complete. While this would mean that some coroutine for a side effect was always started, it wouldn't matter as the suspend CoroutineScope.() -> Unit that was passed to runningSideEffect would not get invoked in the case above where we are talking about rendering and then not rendering before we can dispatch the side effect. So its just the same behavior as LAZY but implementing it ourselves.

I'm more convinced now that we should let the coroutines behavior - of respecting cancellation before dispatch - win out here and impact how side effects are run.

We can add a new API for this use case though - a non-suspending lambda for when the side effect is complete that we will attach to the Job.invokeOnCompletion {}.

e.g. somehting like:

runningSideEffect (key = "key", onCompletion = { // my non-suspending cleanup } ) { 
  // my suspending side effect
}

Stephen Edwards · Answer 6 · Sat Aug 12 2023 00:16:17 GMT+0800 (China Standard Time)

I actually can't yet figure out how to make a test that will reproduce the nested EventLoop scenario that causes the out of order dispatch (where we use an unfrozen render context). The scenario in the PerformancePoetryActivity does reproduce this, but I can't isolate what about it does that in a unit test.

So perhaps the scenario where the lock prevents the side effect from being run is less frequent than we thought?

Stephen Edwards · Answer 7 · Sat Aug 12 2023 00:21:41 GMT+0800 (China Standard Time)

interesting - the lock does not check for cancellation if it doesn't suspend - see documentation here:

Note that this function does not check for cancellation when it is not suspended. Use yield or CoroutineScope.isActive to periodically check for cancellation in tight loops if needed.

So this is actually perfect for us. As the side effect will continue to run (atomically) until its first real suspension point, if render has completed before dispatch occurs.

Only in the mysterious cases where it hasn't completed and the lock is locked when we try to acquire it, then it will suspend. So i'm back to thinking this actually may be a good solution.

Stephen Edwards · Answer 8 · Sat Aug 12 2023 01:57:27 GMT+0800 (China Standard Time)

So that will work for almost all cases. The nested cases when using an Unconfined or immediate dispatcher will likely result in suspending on the lock waiting for render complete, so in those cases the side effect won't be fired after cancel().

I should be clearer about what is working - here is a snippet:

val job = scope.launch(start = ATOMIC, block = {
      renderComplete.lock()
      sideEffect()
    })

I have a hunch that our internal use case is likely the nested scenario. Have not confirmed yet.

We could force the sideEffect to happen each time with NonCancellable but that's not what we want because then it would never get cancelled. e.g.,

THIS WON'T WORK:

   val job = scope.launch(start = ATOMIC, block = {
      try {
        renderComplete.lock()
      } finally {
       withContext(NonCancellable) {
         sideEffect()
       }
      }
    })

So, to cover all use cases I'm thinking about adding a param to the runningSideEffect API for onComplete: () -> Unit. It will not be a suspend fun as we don't want it to suspend. We could execute it in the finally block or in invokeOnComplete on the job.

We'd use it like:

   val job = scope.launch(start = ATOMIC, block = {
      try {
        renderComplete.lock()
        sideEffect()
      } finally {
        onComplete()
      }
    })

OR

   val job = scope.launch(start = ATOMIC, block = {
      renderComplete.lock()
      sideEffect()
    })
   job.invokeOnComplete(onComplete)

This should cover all of our use cases as well as do the best effort to honor starting the actual sideEffect() block (if it was ever declared running) in as many cases as possible.

It does beg the question though if we add an optional onComplete, why not just leave the previous behavior as is with starting via LAZY?

My answer to that right now would be that using an ATOMIC dispatch and waiting for the renderComplete.lock() more accurately represents Workflow semantics. We could be wrong about that though!