Schedule.every Doesn't compensate for time spent executing block
HCLarsen opened this issue · comments
I'm not sure if this is intended behaviour or not, but Schedule.every doesn't compensate for the time it takes to execute the block. For instance, if you have a block that takes 2 milliseconds to run, and you set the interval to 1 second, it doesn't run every second, it runs every 1.002 seconds. This can add up after a couple hundred iterations, meaning the time of your execution is inconsistent. Is this the way you intended it to run?
@HCLarsen I'm actually aware of this limitation. It's not the way I intended it to run, but I never had to worry about this difference in my use cases. I agree that this has to be fixed, but I'm afraid that the runner has to be reworked.
You may not have to rework the entire runner. I tried writing a library like this myself before I realized that others had beaten me to it. I was able to overcome the limitation (mostly) by calculating the delay from the time the iteration of the loop began. For instance:
next_time = Time.monotonic
loop do
  next_time += interval_value
  block.call
  sleep [next_time - Time.monotonic, Time::Span.zero].max
end
With interval_value being the Time::Span passed into the method.
@HCLarsen but this solution may cause desynchronization if block.call takes longer than interval_value, even once. This might be a temporary solution, but I'd prefer to solve it by running on another fiber.
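To make the overrun case concrete, here is a minimal sketch of one way to stay on the original time grid when a run takes longer than the interval: skip the ticks that were missed instead of firing late. The names next_tick and every_aligned are hypothetical and not part of the shard, and this is only one possible policy (running immediately to catch up is another).

```crystal
# Hypothetical helper, not part of the shard: given the previously scheduled
# tick, the current monotonic time and the interval, return the next tick
# that lies strictly after `now` while staying on the original grid.
def next_tick(scheduled : Time::Span, now : Time::Span, interval : Time::Span) : Time::Span
  following = scheduled + interval
  return following if following > now
  # The block overran: count the ticks that were missed and skip them.
  missed = ((now - following) / interval).floor.to_i
  following + interval * (missed + 1)
end

# Hypothetical loop built on the helper above.
def every_aligned(interval : Time::Span, &block)
  spawn do
    scheduled = Time.monotonic
    loop do
      block.call
      scheduled = next_tick(scheduled, Time.monotonic, interval)
      # Clamp to zero in case the deadline passed between the two clock reads.
      sleep [scheduled - Time.monotonic, Time::Span.zero].max
    end
  end
end
```

With a 1 second interval, a run that finishes 2.5 seconds after the last scheduled tick would resume at the 3 second mark rather than drifting to 3.5 seconds.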
@hugoabonizio are you talking about running every block.call inside its own fiber?
@HCLarsen something like that, but it's still required to catch the exceptions (including the stop and retry exceptions). I have an implementation in mind using channels to handle the execution and retry, but I haven't had time to work on it yet.
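As a rough illustration of the fiber-per-run idea (not necessarily the implementation the maintainer has in mind), the sketch below keeps the schedule in one ticker fiber and runs each call in its own fiber, reporting exceptions over a channel. The name every_in_fiber and the errors channel are assumptions made for this example; the shard's stop and retry handling is not modelled here.

```crystal
# Hypothetical sketch: the ticker fiber only keeps time, so a slow or
# failing run cannot delay the next tick.
def every_in_fiber(interval : Time::Span, errors : Channel(Exception), &block)
  spawn do
    next_time = Time.monotonic
    loop do
      next_time += interval
      sleep [next_time - Time.monotonic, Time::Span.zero].max
      # Each run gets its own fiber; exceptions are forwarded, not swallowed.
      spawn do
        begin
          block.call
        rescue ex
          errors.send(ex)
        end
      end
    end
  end
end
```

Because the channel in this sketch is unbuffered, a failing run blocks on send until some other fiber receives the exception, so a caller would need a consumer fiber that decides whether to stop, retry, or just log.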
I look forward to seeing it.
Looking at the code, it's basically taking the interval and passing it to sleep. As there's a lot of needless duplication, I'll start by refactoring that into something more reasonable. Currently the code is like this:
def self.every(interval, &block)
  spawn do
    loop do
      sleep interval
      run(block)
    end
  end
end
def self.every(interval : Symbol, &block)
  spawn do
    loop do
      sleep calculate_interval(interval)
      run(block)
    end
  end
end
def self.every(interval : Symbol, at : String | Array, &block)
  spawn do
    loop do
      sleep calculate_interval(interval, at)
      run(block)
    end
  end
end
Before anything else I suggest refactoring this into:
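def self.every(interval, &block)
  spawn do
    loop do
      sleep interval
      run(block)
    end
  end
end
# The captured block is forwarded with &block; passing it as a plain
# argument would not match the signature above.
# Note: the interval is now computed once, not on every iteration.
def self.every(interval : Symbol, &block)
  every(calculate_interval(interval), &block)
end
def self.every(interval : Symbol, at : String | Array, &block)
  every(calculate_interval(interval, at), &block)
end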
Now that the sleep and run only exist in one method, we can concentrate on that. I'd prefer to have some way to opt out of compensating for execution time, as that does incur some processing cost, but in my preliminary tests the change in execution time was insignificant, and it doesn't justify increasing the complexity of my example, so I'll just make it the default. There are a few options, like using Time.monotonic etc., but the easiest (and fastest, according to my very few and very small tests) way to implement this is with Time.measure. The result would be something like this:
def self.every(interval, &block)
  spawn do
    loop do
      duration = Time.measure do
        run(block)
      end
      sleep interval - duration.total_seconds
    end
  end
end
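One caveat with the snippet above: interval - duration.total_seconds assumes interval is a plain number of seconds, and the result goes negative whenever the block takes longer than the interval, which, as far as I know, sleep rejects with an ArgumentError for numeric arguments. A minimal sketch of a guarded variant follows, assuming the interval is a Time::Span; remaining_sleep and every_compensated are hypothetical names, and block.call stands in for the shard's run(block).

```crystal
# Hypothetical helper: how long to sleep after a run that took `elapsed`,
# never returning a negative span.
def remaining_sleep(interval : Time::Span, elapsed : Time::Span) : Time::Span
  remaining = interval - elapsed
  remaining > Time::Span.zero ? remaining : Time::Span.zero
end

# Hypothetical loop using the helper; block.call replaces run(block) here.
def every_compensated(interval : Time::Span, &block)
  spawn do
    loop do
      elapsed = Time.measure { block.call }
      sleep remaining_sleep(interval, elapsed)
    end
  end
end
```

This compensates only for the time spent inside the block. It does not correct for late wake-ups, which the next_time approach earlier in the thread handles by sleeping until an absolute deadline.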
I'd probably shorten it to:
def self.every(interval, &block)
  spawn do
    loop do
      sleep interval - Time.measure { run(block) }.total_seconds
    end
  end
end
But then again, I'd also change the whole spawn and loop thingies to reduce nesting, and it'd probably become something like:
def self.every(interval, &block)
  spawn { loop { sleep interval - Time.measure { run(block) }.total_seconds } }
end
which, let's face it, is overdoing it. Maybe a good compromise could be:
def self.every(interval, &block)
  spawn { loop do
    sleep interval - Time.measure { run(block) }.total_seconds
  end }
end
or:
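def self.spawn_loop(&block)
  # yield cannot be used inside the block captured by spawn,
  # so the captured block is called explicitly.
  spawn { loop { block.call } }
end
def self.every(interval, &block)
  spawn_loop do
    sleep interval - Time.measure { run(block) }.total_seconds
  end
end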
Anyways, Time.measure worked so well for me that I thought it was worth sharing, and as I had come across this shard and this specific issue during my research I thought it'd be nice to share my findings.
In my case using a shard for this would be overkill (I wrote a pretty basic script to monitor some stuff, it only uses about 5 MB of RAM, so yeah, zero dependencies makes a lot of sense in my case), but maybe this can help others.