Support returning both a result and error from ExecuteActivity in a Cadence workflow

Question

dochsu opened this issue 4 years ago · comments

Is your feature request related to a problem? Please describe.
ExecuteActivity captures either the error or, when the error is nil, the result returned by the executing function, never both. This follows idiomatic Go, where every non-error return value is set to its zero value when an error is returned.

To work around this, we can:

1. Return the error in the result. This has two drawbacks:
   a) The built-in Go error is not serializable, so it needs custom marshaling.
   b) It turns off the retry mechanism that an Activity supports.
2. Create a custom error to return the results, which isn't as convenient.

There is a strong case for breaking with the idiomatic approach here: it would cleanly separate the error from the result and require less contortion when a workflow needs to pass a payload along after an error occurs.

Because the result is not returned on the left-hand side, a caller is less likely to ignore the error.

Proposed Solution
Allow both the result and error to be captured from a Cadence Activity

Bowei Xu · Answer 1 · Thu Jun 18 2020 07:44:32 GMT+0800 (China Standard Time)

Hi @dochsu,
I just realized another thing after our discussion: making the change as proposed would be a breaking change for Cadence. Consider existing workflows that assume an activity returns a zero value when an error happens.
So I am inclined to keep the existing behavior.

@emrahs @meiliang86 Any thoughts on this?

David · Answer 2 · Thu Jun 18 2020 20:59:47 GMT+0800 (China Standard Time)

Is it possible to require applications to opt-in to this feature as a workflow option?

Liang Mei · Answer 3 · Fri Jun 19 2020 06:42:50 GMT+0800 (China Standard Time)

@dochsu I don't understand why you need to return both error and result.

Liang Mei · Answer 4 · Fri Jun 19 2020 06:43:26 GMT+0800 (China Standard Time)

You can return custom errors with all the details you want btw. Does it solve your use case?

David · Answer 5 · Fri Jun 19 2020 07:07:17 GMT+0800 (China Standard Time)

@meiliang86 Activities rely on the error to initiate a retry. Even if the activity eventually fails, it might have details to pass back to the caller for consumption in subsequent activities. For example, I may want to generate an email to send regardless of whether the activity succeeds or fails.

@meiliang86 A custom error might work, but I already implemented a uniform return result I can use. Adding a custom error means I need to implement it twice?

emrahs · Answer 6 · Fri Jun 19 2020 11:22:38 GMT+0800 (China Standard Time)

@dochsu I don't quite follow the example you gave. What kind of details do you want to return to your workflow when your activity fails? If the details you wish to return are related to the error, you can put them into the Reason and Details fields of the error and use them for decision making. If you want to return something other than error-related information, you are probably overcomplicating your workflow and activity.

With my current understanding of your proposal, I don't think it would be a good addition to Cadence. Here are a few reasons why:

1. First of all, this would be a breaking change, as @vancexu pointed out. There should be a major benefit for a change like this in order to justify the overhead of making it backward compatible.
2. When you return a result or error from an Activity, it's not guaranteed to be delivered to the workflow all the time. For instance, the output may get lost in the network, or the activity may time out before its output is received. Furthermore, if your activity is retried, your workflow only receives the last failure of the activity, or receives no failures at all if one of the retries succeeds. Given these cases, it's fair to say that you can't count on having a result in many failure scenarios. Therefore, it'd be good if you can eliminate partial-success scenarios and handle all errors with a holistic approach in a consistent and simple way.
3. The "result or error" contract discourages creating monolithic Activities, and that's often a good thing. If an Activity is complex enough to make it challenging to represent the outcome using this simple contract, you should consider breaking it up or utilizing other Cadence features to see if they help (e.g., activity heartbeats or non-retriable errors).
4. Even outside Cadence, it's unconventional to return both an error and data at the same time because it can make the contract between the caller and the callee very ambiguous. It's an anti-pattern that we shouldn't aim to optimize for.
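The retry behaviour described in the second reason above can be sketched with a plain loop. This is a simplified model, not Cadence's retry implementation: `retry` and `flaky` are hypothetical helpers, and real retries also involve backoff, timeouts, and delivery over the network.

```go
package main

import "fmt"

// retry calls fn up to maxAttempts times. It returns the first successful
// output, or the error from the final attempt; earlier errors are discarded.
func retry(maxAttempts int, fn func(attempt int) (string, error)) (string, error) {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		out, err := fn(attempt)
		if err == nil {
			return out, nil
		}
		lastErr = err
	}
	return "", lastErr
}

// flaky returns a function that fails on every attempt before succeedOn.
func flaky(succeedOn int) func(int) (string, error) {
	return func(attempt int) (string, error) {
		if attempt < succeedOn {
			return "", fmt.Errorf("attempt %d failed", attempt)
		}
		return "done", nil
	}
}

func main() {
	// Succeeds on the second attempt: the caller never sees the first failure.
	out, err := retry(3, flaky(2))
	fmt.Printf("out=%q err=%v\n", out, err)

	// Never succeeds within three attempts: only the last failure is visible.
	out, err = retry(3, flaky(5))
	fmt.Printf("out=%q err=%v\n", out, err)
}
```

Any data attached to the failures of attempts 1 and 2 is gone by the time the caller sees an outcome, which is why a payload returned alongside an error cannot be relied on across retries.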

Liang Mei · Answer 7 · Sat Jun 20 2020 05:12:42 GMT+0800 (China Standard Time)

@dochsu Seems like the simplest thing you can do in this case is to return a result with all the information and not an error. Then you do your if else logic based on that. We are going to stick to golang's recommended way of handling results and errors here.
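A minimal sketch of that suggestion, using plain Go functions in place of activities. All names here (`MainResult`, `runMainStep`, `sendEmail`, `workflow`) are hypothetical; in a real workflow the two steps would be invoked through the Cadence client rather than called directly.

```go
package main

import "fmt"

// MainResult is a hypothetical uniform result returned by the main step.
type MainResult struct {
	Succeeded bool
	Failure   string // empty on success
	EmailBody string // populated on both success and failure
}

// runMainStep stands in for the main activity. It reports a business failure
// inside the result and returns a nil error, so an error-driven retry policy
// would not be triggered for it.
func runMainStep(input string) (MainResult, error) {
	if input == "" {
		return MainResult{Failure: "empty input", EmailBody: "Job failed: empty input"}, nil
	}
	return MainResult{Succeeded: true, EmailBody: "Job finished: " + input}, nil
}

// sendEmail stands in for the follow-up activity.
func sendEmail(body string) string { return "sent: " + body }

// workflow sends the email in both branches, then decides on the outcome.
func workflow(input string) (string, error) {
	res, err := runMainStep(input)
	if err != nil {
		return "", err // no result was produced at all
	}
	receipt := sendEmail(res.EmailBody)
	if !res.Succeeded {
		return "", fmt.Errorf("main step failed (%s); %s", res.Failure, receipt)
	}
	return receipt, nil
}

func main() {
	fmt.Println(workflow("report"))
	fmt.Println(workflow(""))
}
```

The trade-off is the one raised earlier in the thread: since the failure travels in the result, retries have to be handled some other way.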

David · Answer 8 · Mon Jun 22 2020 23:07:09 GMT+0800 (China Standard Time)

@emrahs

I'm trying to decompose a monolithic activity into more discrete smaller reusable activities. This application is already in prod (I did not write the original version). I want to manage my risk and the first step is to decompose the main activity into an execute and send email activity. The send email activity takes a param that is generated by the main activity. We always want to send an email even if the main activity fails and the main activity is the one right now that has all the info needed to generate the mail params. In further decompositions that may no longer be the case but right now that's the first step.

Enforcing this idiom makes this more difficult. It also means we need to continue to use our own retry mechanism which we were hoping to replace eventually with Cadence's Activity retry.

With regards to #3, in practice this seems to encourage monolithic activities. I'm confused because there isn't any guidance in the Cadence documentation for how to handle this type of workflow.

#4 - I'm new to Go itself but am aware of this idiom. My understanding is that it isn't enforced by the language itself and in other languages this is generally dealt with using exceptions which force the application to break flow. The Go equivalent here seems to be panics.

With all that said, I will experiment with @meiliang86's suggestion and see if this is a reasonable approach to take.