mainmatter / breethe-server

Air Quality Data for Locations around the World

Home Page: https://breethe.app

Update measurement data every night or so

marcoow opened this issue · comments

We should have a simple mix task that updates the data for all the locations we have in the database every night or so, so that the data is always fresh (enough) and can be returned right away without loading it synchronously.

If we have a mix task for this, we can run it with Heroku's scheduler feature.

Just putting this here for future reference: https://devcenter.heroku.com/articles/scheduler

A GenServer is way more complex, though; the mix task could simply look like this, reusing what we already have:

Location
|> Repo.all()
|> Enum.each(fn location ->
  Airquality.Sources.OpenAQ.get_latest(location.id)
end)
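For reference, here's a sketch of how that snippet could be packaged as a Mix task runnable from Heroku Scheduler. The module name, task name, and aliases are illustrative, not taken from the actual codebase:

```elixir
defmodule Mix.Tasks.Airquality.UpdateMeasurements do
  use Mix.Task

  # Assumed module paths — adjust to match the app's actual structure
  alias Airquality.{Location, Repo}

  @shortdoc "Refreshes measurement data for all stored locations"

  def run(_args) do
    # Boot the application so the Repo and HTTP client are available
    Mix.Task.run("app.start")

    Location
    |> Repo.all()
    |> Enum.each(fn location ->
      Airquality.Sources.OpenAQ.get_latest(location.id)
    end)
  end
end
```

Heroku Scheduler would then just invoke `mix airquality.update_measurements` on its nightly run.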

True, but we'd be depending on Heroku to make the code run. What if we move to a different server?
IMHO, for languages that don't have GenServer-like features built in, that would make sense; but I think we should make use of Elixir's built-in functionality (this is actually the kind of thing the Erlang VM was designed for).

Also, a GenServer isn't actually that complex. We'd need this:

  # @interval needs to be defined as a module attribute,
  # e.g. @interval :timer.hours(24)

  def start_link() do
    GenServer.start_link(__MODULE__, :ok, name: __MODULE__)
  end

  def init(:ok) do
    # Schedule the first run, then reschedule after every run
    Process.send_after(self(), :work, @interval)
    {:ok, %{last_run_at: nil}}
  end

  def handle_info(:work, _state) do
    Location
    |> Repo.all()
    |> Enum.each(fn location ->
      Airquality.Sources.OpenAQ.get_latest(location.id)
    end)

    Process.send_after(self(), :work, @interval)
    {:noreply, %{last_run_at: :calendar.local_time()}}
  end

Also, with a GenServer I'm pretty sure we could effectively spawn a process for each location, making use of parallelism to speed things up (as long as OpenAQ doesn't rate-limit us too much).
The GenServer would also be supervised, so restarting processes in the event of failure would be a breeze.
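A minimal sketch of that parallel-fetch idea, reusing the existing fetch function (the `max_concurrency` value is an arbitrary illustration, chosen to stay well under typical API rate limits):

```elixir
Location
|> Repo.all()
|> Task.async_stream(
  fn location -> Airquality.Sources.OpenAQ.get_latest(location.id) end,
  # Cap concurrency so we don't hammer the OpenAQ API
  max_concurrency: 5,
  timeout: 30_000
)
|> Stream.run()
```

`Task.async_stream/3` runs the fetches in separate processes while bounding how many are in flight at once, which addresses the rate-limit concern directly.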

This is, of course, just my 2 cents. If you'd prefer to use the Heroku scheduler and a mix task, then let's do that 🙂

Heroku Scheduler is basically just a different name for cron, which is available everywhere.
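For the record, the equivalent plain cron entry on any other server would look something like this (the app path and task name are hypothetical):

```shell
# m h dom mon dow — run the update task nightly at 03:00
0 3 * * * cd /app/breethe-server && MIX_ENV=prod mix airquality.update_measurements
```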

Also with GenServer I'm pretty sure we could effectively spawn a process for each location, making use of parallelism to speed up the process.

That's something you'd usually try to avoid when interacting with external APIs, as you might run into request limits.

To me, using a GenServer here looks like reaching for it only because it's available, not because it provides any real benefit in this case…