sethvargo / go-retry

Go library for retrying with configurable backoffs

Exponential and Fibonacci backoffs easily overflow

magnusbaeck opened this issue · comments

If you let an exponential or Fibonacci backoff run for a while, it will eventually overflow. See https://go.dev/play/p/FUNKkZU0Kyn for an example. The NewExponential documentation says:

It's very efficient, but does not check for overflow, so ensure you bound the retry.

but AFAICT you can't prevent overflows with WithCappedDuration, since the overflow has already happened by the time the cap is applied. How about changing these backoff functions to detect overflows and consistently return the largest time.Duration once the ceiling has been reached? Or am I missing something?
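To make the failure mode concrete, here is a minimal sketch of the idea proposed above. It assumes the backoff is computed as a left shift of the base, and both `naiveExponential` and `saturatingExponential` are hypothetical helpers written for illustration, not go-retry functions.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// maxDuration is the largest value a time.Duration can hold.
const maxDuration = time.Duration(math.MaxInt64)

// naiveExponential computes base * 2^attempt with a plain shift and no
// overflow check. It is an illustrative stand-in, not go-retry's code.
func naiveExponential(base time.Duration, attempt uint) time.Duration {
	return base << attempt
}

// saturatingExponential computes the same value but returns maxDuration
// when the shift would not fit in an int64. It assumes base > 0.
func saturatingExponential(base time.Duration, attempt uint) time.Duration {
	if attempt >= 63 || base > maxDuration>>attempt {
		return maxDuration
	}
	return base << attempt
}

func main() {
	for _, attempt := range []uint{0, 33, 34, 55} {
		fmt.Printf("attempt %2d: naive=%v saturating=%v\n",
			attempt,
			naiveExponential(time.Second, attempt),
			saturatingExponential(time.Second, attempt))
	}
}
```

With a one-second base, the naive shift still fits at attempt 33, turns negative at attempt 34, and reaches zero at attempt 55, while the saturating variant stays pinned at the maximum duration from attempt 34 onward.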

If you can figure out a way to do that such that it's still safe for concurrent use and doesn't introduce races, I'd be very interested :)

Oooh, you want concurrency too. Should be feasible with a lock, or are those off limits?

All the backoffs are safe for concurrent use and need to remain that way. They currently use atomic operations and are incredibly fast. When I benchmarked it, introducing a mutex added a fairly large performance hit.
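One possible way to keep the lock-free property is to make the overflow check a pure function of the attempt number handed out by a single atomic add, so no mutex is involved. The sketch below assumes that approach; `saturatingBackoff` and `collect` are hypothetical names, and this is not presented as the library's implementation.

```go
package main

import (
	"fmt"
	"math"
	"sync"
	"sync/atomic"
	"time"
)

// maxDuration is the largest value a time.Duration can hold.
const maxDuration = time.Duration(math.MaxInt64)

// saturatingBackoff is a hypothetical type, not part of go-retry.
// The only shared state is the attempt counter, updated atomically.
type saturatingBackoff struct {
	base    time.Duration // assumed to be > 0
	attempt uint64
}

// Next returns base * 2^n for the n-th call (n starts at 0), or
// maxDuration once that value would overflow an int64.
func (b *saturatingBackoff) Next() time.Duration {
	n := atomic.AddUint64(&b.attempt, 1) - 1
	if n >= 63 || b.base > maxDuration>>n {
		return maxDuration
	}
	return b.base << n
}

// collect calls Next from `calls` goroutines and returns every result.
func collect(b *saturatingBackoff, calls int) []time.Duration {
	out := make([]time.Duration, calls)
	var wg sync.WaitGroup
	for i := 0; i < calls; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			out[i] = b.Next() // each goroutine writes its own slot
		}(i)
	}
	wg.Wait()
	return out
}

func main() {
	b := &saturatingBackoff{base: time.Second}
	capped := 0
	for _, d := range collect(b, 100) {
		if d == maxDuration {
			capped++
		}
	}
	fmt.Println("calls that hit the ceiling:", capped)
}
```

Because each call receives a distinct attempt number from the atomic add, the set of returned durations is the same regardless of how the goroutines interleave. Whether the extra comparison is measurable next to the atomic add would need benchmarking.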

Percentage-wise I'm sure that's true, but you'd have to run a really tight retry loop for that to have any significance.

I'm not sure what other options there are. It's not entirely easy to make WithCappedDuration detect overflows (except for the special case of negative durations), and then we're down to adjusting the documentation to be clearer. Right now it's easy to read it as "just use WithCappedDuration and you'll be fine".

I think WithCappedDuration, specifically, can check for negative times and then apply the max.

Yes, but retry.NewExponential(1 * time.Second) starts returning zero after a while (see https://go.dev/play/p/U0GEa6PRi4U), so we'd need a special case for that too. Bases must always be strictly greater than zero, so perhaps applying the max for any value <= 0 would work out?
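The "<= 0" rule suggested here can be sketched as a small capping helper. `capDuration` and `rawExponential` are hypothetical functions written for this example and do not reflect how WithCappedDuration is actually implemented.

```go
package main

import (
	"fmt"
	"time"
)

// rawExponential is an unchecked base * 2^attempt, used here only to
// produce overflowed inputs for the capping helper.
func rawExponential(base time.Duration, attempt uint) time.Duration {
	return base << attempt
}

// capDuration is a hypothetical helper. It treats any non-positive
// duration as an overflow (valid backoffs are assumed to be > 0) and
// otherwise clamps d to limit.
func capDuration(limit, d time.Duration) time.Duration {
	if d <= 0 || d > limit {
		return limit
	}
	return d
}

func main() {
	const limit = 30 * time.Second
	for _, attempt := range []uint{0, 4, 5, 34, 55} {
		raw := rawExponential(time.Second, attempt)
		fmt.Printf("attempt %2d: raw=%v capped=%v\n",
			attempt, raw, capDuration(limit, raw))
	}
}
```

One caveat: an overflowed value can also wrap around to a positive number (a one-second base shifted by 37 does, for example). In that case this check only helps when the wrapped value happens to exceed the cap, so detecting the overflow inside the backoff itself is the more robust option.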

The problem is that Next() will continue to run, so eventually you'll stop overflowing.

Thanks for fixing this!
