hugoabonizio / schedule.cr

:clock3: Run periodic tasks in Crystal

Home Page:https://hugoabonizio.github.io/schedule.cr/Schedule.html

Schedule.every Doesn't compensate for time spent executing block

HCLarsen opened this issue · comments

I'm not sure if this is intended behaviour or not, but Schedule.every doesn't compensate for the time it takes to execute the block. For instance, if you have a block that takes 2 milliseconds to run, and you set the interval to 1 second, it doesn't run every second, it runs every 1.002 seconds. This can add up after a couple hundred iterations, meaning the time of your execution is inconsistent. Is this the way you intended it to run?
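To put a number on that drift, here is a small Crystal sketch of the arithmetic. The 2 ms block time comes from the example above; the 500 iterations are an assumed figure chosen for illustration.

```crystal
# Illustrative values: 2 ms per block run (from the example above) and
# an assumed 500 iterations.
block_time = 2.milliseconds
iterations = 500

# Each iteration starts block_time later than intended, so the lag accumulates.
drift = block_time * iterations

puts "Drift after #{iterations} runs: #{drift.total_seconds} s"
```

With these values the schedule ends up a full second behind after 500 runs.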

@HCLarsen I'm actually aware of this limitation. It's not the way I intended it to run, but I never had to worry about this difference in my use cases. I agree that this has to be fixed, but I'm afraid the runner has to be reworked.

You may not have to rework the entire runner. I tried writing a library like this myself before I realized that others had beaten me to it. I was able to overcome the limitation (mostly) by calculating the delay from the time the iteration of the loop began. For instance:

  next_time = Time.monotonic
  loop do
    next_time += interval_value
    block.call
    sleep [next_time - Time.monotonic, Time::Span.zero].max
  end

With interval_value being the Time::Span passed into the method.

@HCLarsen but this solution may cause desynchronization if block.call takes longer than interval_value, even once. This might work as a temporary solution, but I'd prefer to solve it by running the block on another fiber.
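One possible way to limit that desynchronization without moving the block to another fiber is to skip the ticks that were missed and realign to the original grid. The sketch below is only an illustration under that assumption: `next_deadline` is a hypothetical helper, not part of schedule.cr, and it is not the fiber-based approach mentioned above.

```crystal
# Hypothetical helper (not part of schedule.cr): returns the first deadline on
# the `previous + n * interval` grid that lies strictly after `now`.
def next_deadline(previous : Time::Span, interval : Time::Span, now : Time::Span) : Time::Span
  deadline = previous + interval
  return deadline if deadline > now

  # The block overran: count the whole intervals that were missed and skip them.
  missed = ((now - deadline) / interval).floor.to_i64
  deadline + interval * (missed + 1)
end

# Example: ticks every second, previous tick at 0 s, block finished at 2.5 s.
# The ticks at 1 s and 2 s are skipped and the next run lands on 3 s.
puts next_deadline(0.seconds, 1.second, 2.5.seconds).total_seconds
```

In a loop like the one above, `previous` would be the last deadline and `now` would be `Time.monotonic` taken right after `block.call`, with the sleep duration clamped at `Time::Span.zero` as before.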

@hugoabonizio are you talking about running every block.call inside its own fiber?

@HCLarsen something like that, but it's still necessary to catch the exceptions (including the stop and retry exceptions). I have an implementation in mind that uses channels to handle the execution and retry, but I haven't had time to work on it yet.

I look forward to seeing it.

Looking at the code, it basically takes the interval and passes it to sleep. There's a lot of needless duplication, so I'd start by refactoring that into something more reasonable. Currently the code looks like this:

  def self.every(interval, &block)
    spawn do
      loop do
        sleep interval
        run(block)
      end
    end
  end

  def self.every(interval : Symbol, &block)
    spawn do
      loop do
        sleep calculate_interval(interval)
        run(block)
      end
    end
  end

  def self.every(interval : Symbol, at : String | Array, &block)
    spawn do
      loop do
        sleep calculate_interval(interval, at)
        run(block)
      end
    end
  end

Before anything else I suggest refactoring this into:

  def self.every(interval, &block)
    spawn do
      loop do
        sleep interval
        run(block)
      end
    end
  end

  def self.every(interval : Symbol, &block)
    # Forward the captured block with & so it is passed as a block, not as an argument
    every(calculate_interval(interval), &block)
  end

  def self.every(interval : Symbol, at : String | Array, &block)
    every(calculate_interval(interval, at), &block)
  end

Now that the sleep and run exist in only one method, we can concentrate on that. I'd prefer to have some way to opt out of compensating for execution time, as it does incur some processing cost. In my preliminary tests, though, the change in execution time was insignificant and doesn't justify complicating my example, so I'll just make it the default. There are a few options, such as Time.monotonic, but the easiest (and fastest, according to my very few and very small tests) way to implement this is with Time.measure. The result would be something like this:

  def self.every(interval, &block)
    spawn do
      loop do
        duration = Time.measure do
          run(block)
        end
        sleep interval - duration.total_seconds
      end
    end
  end

I'd probably shorten it to:

  def self.every(interval, &block)
    spawn do
      loop do
        sleep interval - Time.measure { run(block) }.total_seconds
      end
    end
  end

But then again, I'd also change the whole spawn and loop thingies to reduce nesting, and it'd probably become something like:

  def self.every(interval, &block)
    spawn { loop { sleep interval - Time.measure { run(block) }.total_seconds } }
  end

which, let's face it, is overdoing it. Maybe a good compromise could be:

  def self.every(interval, &block)
    spawn { loop do
      sleep interval - Time.measure { run(block) }.total_seconds
    end }
  end

or:

  def self.spawn_loop(&block)
    # yield cannot be used inside the block captured by spawn, so call the proc instead
    spawn { loop { block.call } }
  end

  def self.every(interval, &block)
    spawn_loop do
      sleep interval - Time.measure { run(block) }.total_seconds
    end
  end

Anyways, Time.measure worked so well for me that I thought it was worth sharing, and as I had come across this shard and this specific issue during my research I thought it'd be nice to share my findings.
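The Time.measure variants above subtract a float from interval, so they assume a numeric interval in seconds, and the result goes negative whenever the block runs longer than the interval. As a hedged sketch of how the same idea might look with a Time::Span interval and a zero floor on the sleep: `every_compensated` is a hypothetical standalone helper, not part of schedule.cr, and it calls the block directly instead of going through the shard's `run`.

```crystal
# Hypothetical standalone helper; not part of schedule.cr.
def every_compensated(interval : Time::Span, &block : ->)
  spawn do
    loop do
      # Time.measure returns the elapsed time as a Time::Span.
      elapsed = Time.measure { block.call }
      # Clamp at zero so an overlong run never produces a negative sleep.
      sleep [interval - elapsed, Time::Span.zero].max
    end
  end
end

# Usage: count ticks for a short while on the main fiber.
count = 0
every_compensated(10.milliseconds) { count += 1 }
sleep 100.milliseconds
puts "ticks so far: #{count}"
```

Like the originals, this runs the block first and then sleeps for whatever is left of the interval, so the first run happens immediately rather than after one interval.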

In my case using a shard for this would be overkill (I wrote a pretty basic script to monitor some stuff, and it only uses about 5 MB of RAM, so zero dependencies makes a lot of sense), but maybe this can help others.