caller timeout issue
goldcode88 opened this issue · comments
For example:
A long-running callee needs 10 seconds to complete.
We set the caller timeout to 30 seconds.
opts := wamp.Dict{wamp.OptTimeout: 30000}
result, err := caller.Call(ctx, "sum", opts, nil, nil, nil)
Run the callee:
$ ./callee
Then run the caller:
$ ./caller
After 2 seconds, I killed the callee and found that the caller never times out; it hangs forever.
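Try this code:
// Bound the call with a 30-second deadline; cancel releases the timer.
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
kwArgs := wamp.Dict{
	"wait": true,
}
result1, err1 = caller.Call(ctx, "sum", nil, nil, kwArgs, nil)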
This was intentional, so that handling a timeout could be done either by the Callee or the Caller. The Callee can cancel the call when processing a request is taking too long, or the Caller can cancel the call if the Callee is taking too long to respond. Having the Dealer handle the call timeout requires a timer for every pending call. Each timer must be removed when the Callee responds or when the Caller or Callee cancels the call.
In my opinion, this complicates the Dealer without adding much value, since the Caller can always cancel the request. The nexus client makes this very convenient.
However... not everyone is using a nexus client, and a nexus router may be used with clients that do not support the call timeout feature. So maybe the Dealer should handle this?
Maybe @dcarbone and @cameronelliott can weigh in on this?
Well, the spec is pretty wide open about timeouts.
The spec allows for timeouts at any of: caller/callee/dealer.
It doesn't mandate that they be handled anywhere, and the feature is in beta.
Both Wampy.js and Autobahn show examples of using JS setTimeout
for timers with RPC, and I am doing the same. (So this might
be a good application-level option.)
I would second the idea of reducing complexity in the dealer's codebase
around RPC, and pushing this responsibility up to the caller/callee, or
even up to the application.
There are really two issues here:
- When a callee has been called and disconnects before returning a response, the pending calls hang until the caller decides to cancel them. PR #228 fixes that by canceling any pending calls to a callee that disconnects.
- When the timeout option is specified with a call, but the callee does not support it or does not honor it (maybe the callee hangs), the call is not canceled until the caller decides to cancel it.
Ultimately, it is best if the caller/app always handles the timeout and cancels the call. However, if that were the case, there would be no need for the call_timeout feature at all. So if a caller is using call_timeout, it means that the caller cannot handle timing out and canceling the call itself, and is relying on call_timeout to take care of that. There is no guarantee that the callee will actually time out and cancel the call, since it may be stuck. This means that for call_timeout to be reliable, the router MUST handle it.
By the above reasoning, I arrive at the conclusion that if the router (dealer) supports call_timeout, then the router must be able to automatically cancel the call when the timeout expires. This is handled by PR #227.
Both issues discussed here are fixed in nexus v3.0.1.