Actyx / machines

[machine-runner] Failing Command Call API Enhancement

Kelerchian opened this issue · comments

This issue aims to settle how machine-runner's command calls report errors to the call site, that is, the code that awaits the call.
Also addressing #70 (comment)
Commonly, throwing instances of Error subclasses is the standard way of doing it.
However, this problem is complicated by the several conditions that can trigger an error, and by the fact that throwing would be a breaking change of significant severity.

Triggers of error are:

  • State expiry
  • Locked command
  • Destroyed machine
  • Any error thrown by the Actyx SDK's publish function

Currently, only the last one throws; the rest do not.

Special property of Expiry: Not anyone's fault

Locked commands and destroyed machines most likely result from bugs.
Errors from SDK's publish functions are unavoidable (e.g. lost connection) and beyond the application's control.
However, state expiry can occur due to a peer's action, which is nobody's fault.

The example below shows that state expiry can happen in an otherwise innocent piece of code.

for await (const state of machine) {
  const whenMoving = state.as(Moving);
  if (whenMoving) {
    // Long-running async function; an incoming event triggers expiry here.
    await actuateMove();
    // When called, the machine warns "expiry".
    await whenMoving.commands?.doneMoving();
  }
}

Because of this, throwing async errors (i.e. rejecting the promise) would break the pattern above.
An uncaught rejection from an awaited promise breaks control flow out of the for await loop.
To counter the rejection the application developer must either:

  • Add .catch(e => e instanceof MachineRunnerErrorCommandFiredAfterExpired ? Promise.resolve(e) : Promise.reject(e))
  • Add the try-catch equivalent of the previous point. This try-catch block must be exclusive.
  • Limit the pre-command local actions (e.g. in this case await actuateMove()) to idempotent ones, that is, actions that have the same effect whether called once or n times, to counter the breaking behavior.

Since the above pattern is the machine-runner's only and recommended pattern of use, making state expiry a thrown error will force the user to apply one of the above fixes to their entire codebase.
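
For reference, a minimal sketch of what the first two workarounds could look like when factored into a helper. The class below is only a local stand-in for the MachineRunnerErrorCommandFiredAfterExpired named in the list above (the real export, if any, may differ), and ignoreExpiry is a hypothetical helper name, not part of machine-runner.

```typescript
// Local stand-in for the error class named in this issue (assumption:
// the actual class exported by machine-runner may look different).
class MachineRunnerErrorCommandFiredAfterExpired extends Error {
  constructor() {
    super("command fired after state expired");
    this.name = "MachineRunnerErrorCommandFiredAfterExpired";
  }
}

// Hypothetical helper: swallow only expiry rejections, rethrow everything else.
// Accepts undefined so it composes with `commands?.doneMoving()`.
async function ignoreExpiry<T>(
  call: Promise<T> | undefined
): Promise<T | undefined> {
  try {
    return await call;
  } catch (e) {
    if (e instanceof MachineRunnerErrorCommandFiredAfterExpired) {
      return undefined; // expiry is nobody's fault: keep the loop running
    }
    throw e; // locked command, destroyed machine, publish failure, ...
  }
}

// Intended use inside the for await loop from above:
//   await ignoreExpiry(whenMoving.commands?.doneMoving());
```

Even with such a helper, every command call site in the codebase would have to be touched, which is the cost described above.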

Potential Solutions

  1. All errors except expiry throw. An additional config can be added later.
  2. Wrap StateOpaque and State with a Proxy to intercept command property access and return undefined when expired (may affect performance)
  3. Use WeakRef to set and unset the command property of State and StateOpaque. (Limits machine-runner to recent browser and Node.js versions, because WeakRef is relatively new.)
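
To make option 2 more concrete, here is a rough sketch of the Proxy idea. StateLike, Commands, guardCommands and the isExpired callback are all hypothetical names for illustration and are much simpler than machine-runner's real State and StateOpaque types.

```typescript
// Hypothetical, simplified shape of a state object (not the real types).
type Commands = { doneMoving: () => Promise<void> };

interface StateLike {
  payload: unknown;
  commands?: Commands;
}

// Wrap a state so that reading `commands` yields undefined once
// `isExpired()` reports true. All other properties pass through.
function guardCommands<S extends StateLike>(
  state: S,
  isExpired: () => boolean
): S {
  return new Proxy(state, {
    get(target, prop, receiver) {
      if (prop === "commands" && isExpired()) return undefined;
      return Reflect.get(target, prop, receiver);
    },
  });
}
```

With this shape, whenMoving.commands?.doneMoving() becomes a silent no-op after expiry, at the price of one extra trap invocation per property access, which is where the performance concern comes from.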

Before discussing implementation details let’s settle on what the code should mean:

for await (...) {
  ...
  if (whenMoving) { // reaching this state means we need to do something
    await actuateMove() // so we do it
    await whenMoving.commands?.doneMoving(); // and we tell everyone else about it
    // now we take a look at the next state and take it from there
  }
  ...
}

To me these intuitions make a lot of sense, so we should try to match them as closely as we can with our API semantics. Ideally, if the .doneMoving() command is still available on the current state (not necessarily on whenMoving due to expiry) we should invoke it. This could be done by making .commands a getter that refreshes the state with machine.get() in case the referenced state is expired.
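
A rough sketch of that getter idea, under heavy assumptions: Snapshot, MachineLike, StateHandle and rawCommands are made-up names, and only commands and get() correspond to things mentioned in this thread.

```typescript
// Hypothetical minimal model of a state snapshot and its machine.
type Commands = { doneMoving?: () => Promise<void> };

interface Snapshot {
  expired: boolean;
  rawCommands?: Commands;
}

interface MachineLike {
  // Latest known state, standing in for machine.get(); null if none.
  get(): Snapshot | null;
}

class StateHandle {
  private machine: MachineLike;
  private snapshot: Snapshot;

  constructor(machine: MachineLike, snapshot: Snapshot) {
    this.machine = machine;
    this.snapshot = snapshot;
  }

  // Commands are resolved at access time: if the referenced snapshot has
  // expired, fall back to whatever the machine's current state offers.
  get commands(): Commands | undefined {
    const current = this.snapshot.expired
      ? this.machine.get()
      : this.snapshot;
    return current?.rawCommands;
  }
}
```

In this sketch the refreshed state may no longer offer doneMoving at all, so a call site would need handle.commands?.doneMoving?.() or an equivalent check.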

The alternative is to officially bump the version to 0.5.0 and refactor it to whenMoving.commands()?.doneMoving().


I agree with the rest of your analysis regarding all other error sources.

Ah, that is a nice compromise.
I think there is one question left, but one that is easier to answer.

  if (whenMoving) {
    const commands = whenMoving.commands; // commands are generated here
    await actuateMove();                  // expires here
    await commands?.doneMoving();         // throws because of expiry
  }

Then we can say that this usage is not recommended, and thus an ExpiryError here is the programmer's error.
What do you think?

Yes, good point, agreed.