Allow adaptive limits
cep21 opened this issue
Problem
In the basic case, we want to time out or limit the rare bad request so we can maintain a good SLA. However, when problems happen (maybe the database takes 110ms rather than 100ms for all requests because of a DB issue), we don't want to fail 100% of requests and would rather increase our timeout by a bit while requests are slow, and move it back down when things normalize.
Idea
Move all limits from static numbers to a (min, max, rate of change) triple. For example, you could have a timeout normally at 100ms, but allow it to increase by 10ms per some unit if requests are slower than 100ms, while never allowing the timeout to go above 300ms. Then, when things settle down, allow requests to time out at 100ms again.
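As a rough sketch of the (min, max, rate of change) idea, a limit could be a small type that moves one step at a time and clamps at both ends. `AdaptiveLimit`, `Raise`, and `Lower` are hypothetical names made up for this example and are not part of the circuit library; the 100ms/300ms/10ms values are the ones from the example above.

```go
package main

import (
	"fmt"
	"time"
)

// AdaptiveLimit is a hypothetical type for illustration; it is not part of
// cep21/circuit. It keeps a duration within [Min, Max] and moves it one Step
// at a time. It is not safe for concurrent use without external locking.
type AdaptiveLimit struct {
	Min, Max, Step time.Duration
	current        time.Duration
}

// NewAdaptiveLimit starts the limit at its strictest value, lo.
func NewAdaptiveLimit(lo, hi, step time.Duration) *AdaptiveLimit {
	return &AdaptiveLimit{Min: lo, Max: hi, Step: step, current: lo}
}

// Current returns the limit currently in effect.
func (a *AdaptiveLimit) Current() time.Duration { return a.current }

// Raise loosens the limit by one step, never going above Max.
func (a *AdaptiveLimit) Raise() time.Duration {
	a.current += a.Step
	if a.current > a.Max {
		a.current = a.Max
	}
	return a.current
}

// Lower tightens the limit by one step, never going below Min.
func (a *AdaptiveLimit) Lower() time.Duration {
	a.current -= a.Step
	if a.current < a.Min {
		a.current = a.Min
	}
	return a.current
}

func main() {
	timeout := NewAdaptiveLimit(100*time.Millisecond, 300*time.Millisecond, 10*time.Millisecond)

	fmt.Println(timeout.Raise()) // one slow period: 110ms

	for i := 0; i < 30; i++ { // a long slow period is capped at Max
		timeout.Raise()
	}
	fmt.Println(timeout.Current()) // 300ms

	for i := 0; i < 30; i++ { // recovery walks back down to Min
		timeout.Lower()
	}
	fmt.Println(timeout.Current()) // 100ms
}
```

What "per some unit" means (per request, per rolling window, per second) is left open here; the caller decides when to call `Raise` and `Lower`.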
Solution
Circuit open/close logic is defined inside https://github.com/cep21/circuit/blob/master/closers.go#L9 and they listen to all the events on https://github.com/cep21/circuit/blob/master/metrics.go#L164
The function ShouldOpen is called when a circuit decides if it should open: https://github.com/cep21/circuit/blob/master/closers.go#L14
Right now, for hystrix, we open directly on error percentage https://github.com/cep21/circuit/blob/master/closers/hystrix/opener.go#L140
Instead of opening on some threshold, it could detect why the circuit is failing, for example because of too many timeouts or concurrency limit rejections. If that is the cause, it would modify the thread-safe config on the circuit https://github.com/cep21/circuit/blob/master/circuit.go#L71 to increase the timeout. On concurrent Success, we can inspect the timeouts and lower the limit if things recover.
Similarly, on ErrConcurrencyLimitReject calls, we could increase the concurrency limits up to a point, and decrease them on Success without ErrInterrupt.
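A minimal sketch of that event-driven adjustment might look like the following. The `tuner` type and its `OnTimeout`, `OnConcurrencyReject`, and `OnSuccess` methods are hypothetical stand-ins for whatever metric callbacks the opener receives; they are not the library's API, and the step of lowering the timeout only when a success fits inside the baseline timeout is an assumed policy. Writing the new values back into the circuit's thread-safe config is left out.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tuner is a hypothetical helper, not part of cep21/circuit. It tracks an
// adaptive timeout and an adaptive concurrency limit, guarded by a mutex
// because metric events can arrive from many goroutines.
type tuner struct {
	mu sync.Mutex

	timeout, minTimeout, maxTimeout, timeoutStep time.Duration
	concurrency, minConcurrency, maxConcurrency  int
}

// OnTimeout loosens the timeout by one step, up to maxTimeout.
func (t *tuner) OnTimeout() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.timeout += t.timeoutStep
	if t.timeout > t.maxTimeout {
		t.timeout = t.maxTimeout
	}
}

// OnConcurrencyReject allows one more concurrent call, up to maxConcurrency.
func (t *tuner) OnConcurrencyReject() {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.concurrency < t.maxConcurrency {
		t.concurrency++
	}
}

// OnSuccess tightens both limits toward their baselines. The timeout is only
// lowered when the call would have fit inside the baseline timeout (assumed
// policy); the concurrency limit drops by one on every success.
func (t *tuner) OnSuccess(took time.Duration) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if took <= t.minTimeout {
		t.timeout -= t.timeoutStep
		if t.timeout < t.minTimeout {
			t.timeout = t.minTimeout
		}
	}
	if t.concurrency > t.minConcurrency {
		t.concurrency--
	}
}

// Snapshot returns the limits currently in effect.
func (t *tuner) Snapshot() (time.Duration, int) {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.timeout, t.concurrency
}

func main() {
	t := &tuner{
		timeout: 100 * time.Millisecond, minTimeout: 100 * time.Millisecond,
		maxTimeout: 300 * time.Millisecond, timeoutStep: 10 * time.Millisecond,
		concurrency: 10, minConcurrency: 10, maxConcurrency: 20,
	}

	// The backend slows down: three timeouts and two rejections arrive.
	t.OnTimeout()
	t.OnTimeout()
	t.OnTimeout()
	t.OnConcurrencyReject()
	t.OnConcurrencyReject()
	fmt.Println(t.Snapshot()) // 130ms 12

	// A fast success arrives: both limits move one step back.
	t.OnSuccess(90 * time.Millisecond)
	fmt.Println(t.Snapshot()) // 120ms 11
}
```

Stepping down on every success is deliberately simple and probably too eager for the concurrency limit; a real implementation would likely adjust once per rolling window instead.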
Implementation
Make another package inside https://github.com/cep21/circuit/tree/master/closers called hystrix-adaptive.
It uses composition to include the hystrix package, but changes ShouldOpen to be adaptive.
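The composition could be sketched roughly as below. Everything here is a simplified stand-in: the `opener` interface, `thresholdOpener` (playing the role of the hystrix opener), and the `timeoutsDominant` flag are assumptions for illustration, and the library's real interface and method signatures may differ.

```go
package main

import (
	"fmt"
	"time"
)

// opener is a simplified stand-in for the interface that decides whether a
// circuit should open. It is an assumption, not the library's definition.
type opener interface {
	ShouldOpen() bool
}

// thresholdOpener plays the role of the hystrix opener: it wants to open
// once the error percentage reaches a threshold.
type thresholdOpener struct {
	errorPercent, threshold int
}

func (o *thresholdOpener) ShouldOpen() bool { return o.errorPercent >= o.threshold }

// adaptiveOpener embeds another opener and overrides ShouldOpen. When the
// inner opener wants to open and timeouts are the main cause, it first tries
// loosening the timeout and only opens once the timeout is already at its cap.
type adaptiveOpener struct {
	opener

	timeout, maxTimeout, step time.Duration
	timeoutsDominant          bool // hypothetical signal derived from metrics
}

func (a *adaptiveOpener) ShouldOpen() bool {
	if !a.opener.ShouldOpen() {
		return false // inner opener is happy, nothing to adapt
	}
	if a.timeoutsDominant && a.timeout < a.maxTimeout {
		a.timeout += a.step
		if a.timeout > a.maxTimeout {
			a.timeout = a.maxTimeout
		}
		return false // loosened the limit instead of opening
	}
	return true // no room left to adapt, or failures are not timeouts
}

func main() {
	a := &adaptiveOpener{
		opener:           &thresholdOpener{errorPercent: 60, threshold: 50},
		timeout:          100 * time.Millisecond,
		maxTimeout:       120 * time.Millisecond,
		step:             10 * time.Millisecond,
		timeoutsDominant: true,
	}

	fmt.Println(a.ShouldOpen(), a.timeout) // false 110ms
	fmt.Println(a.ShouldOpen(), a.timeout) // false 120ms
	fmt.Println(a.ShouldOpen(), a.timeout) // true 120ms
}
```

Embedding keeps the inner opener's other behavior untouched, so the adaptive package only has to own the decision of when to adapt versus when to open.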