dcdourado / ecto_backfiller

A back-pressured backfill executor for Ecto

Offset usage

dcdourado opened this issue

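If the backfill module's handle_batch/1 callback changes which rows the query matches, advancing the offset causes some rows to be skipped: each processed batch drops out of the result set, so the remaining rows shift toward the start and the next offset jumps past them.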

Example:

Query -> users where email_verified = true
Backfill execution -> updates email_verified = false

As currently implemented, the incremental offset strategy does not work for this use case.
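To make the skipping concrete, here is a minimal simulation in plain Python. It does not use Ecto or this library; `fetch_batch`, `handle_batch`, and `run_with_incrementing_offset` are hypothetical stand-ins for the filtered query, the callback, and the producer loop, and the batch size of 3 is arbitrary.

```python
def fetch_batch(users, offset, limit):
    # Stand-in for: users where email_verified = true, ordered by id,
    # with OFFSET and LIMIT applied to the filtered result.
    matching = [u for u in users if u["email_verified"]]
    return matching[offset:offset + limit]


def handle_batch(batch):
    # Stand-in for the backfill callback: it flips the very column
    # the query filters on.
    for user in batch:
        user["email_verified"] = False


def run_with_incrementing_offset(users, limit):
    offset = 0
    processed = []
    while True:
        batch = fetch_batch(users, offset, limit)
        if not batch:
            break
        handle_batch(batch)
        processed.extend(user["id"] for user in batch)
        offset += limit  # the result set has already shrunk by `limit` rows
    return processed


users = [{"id": i, "email_verified": True} for i in range(1, 11)]
processed = run_with_incrementing_offset(users, limit=3)
skipped = [u["id"] for u in users if u["email_verified"]]

print(processed)  # [1, 2, 3, 7, 8, 9]
print(skipped)    # [4, 5, 6, 10]
```

With 10 matching rows, only 6 are ever handed to the callback; the other 4 still match the query when the run ends.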

We could reset the offset to 0 after handle_batch executes. With more than one consumer, however, the same rows could be processed twice. That would force every handle_batch implementation to be idempotent and make the whole operation more expensive.
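A rough sketch of that double processing, again in plain Python rather than Ecto. It assumes one particular interleaving, in which two consumers both fetch from offset 0 before either has written its updates; the function names and the `handled` counter are hypothetical.

```python
from collections import Counter


def fetch_batch(users, limit):
    # Offset is always 0: take the first `limit` rows that still match.
    return [u for u in users if u["email_verified"]][:limit]


def handle_batch(batch, handled):
    for user in batch:
        user["email_verified"] = False
        handled[user["id"]] += 1  # count how often each row is processed


users = [{"id": i, "email_verified": True} for i in range(1, 7)]
handled = Counter()

# Assumed interleaving: both consumers read before either one writes.
batch_a = fetch_batch(users, limit=3)
batch_b = fetch_batch(users, limit=3)
handle_batch(batch_a, handled)
handle_batch(batch_b, handled)

duplicates = sorted(user_id for user_id, n in handled.items() if n > 1)
print(duplicates)  # [1, 2, 3]
```

Whether this interleaving actually happens depends on how batches are dispatched to consumers, but nothing in an always-zero offset prevents it.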

Any suggestions?

Maybe the backfiller should only receive the schema struct and then query the whole table sorting by inserted_at ASC, with no custom filters.
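A sketch of how that could behave, once more as a plain-Python simulation with hypothetical names. The query has no filter and a stable order, so the callback cannot change which rows a given offset points at; the filtering moves into `handle_batch`. It assumes that inserted_at is never updated and that no rows are inserted or deleted while the backfill runs.

```python
from datetime import datetime, timedelta


def fetch_batch(users, offset, limit):
    # Stand-in for: all users, ORDER BY inserted_at ASC, OFFSET/LIMIT,
    # with no custom filter.
    ordered = sorted(users, key=lambda u: u["inserted_at"])
    return ordered[offset:offset + limit]


def handle_batch(batch):
    # The condition that used to live in the query is checked here.
    for user in batch:
        if user["email_verified"]:
            user["email_verified"] = False


start = datetime(2024, 1, 1)
users = [
    {
        "id": i,
        "inserted_at": start + timedelta(minutes=i),
        "email_verified": i % 2 == 0,
    }
    for i in range(1, 11)
]

offset, limit, visited = 0, 3, 0
while True:
    batch = fetch_batch(users, offset, limit)
    if not batch:
        break
    handle_batch(batch)
    visited += len(batch)
    offset += limit

still_verified = [u["id"] for u in users if u["email_verified"]]
print(visited)         # 10
print(still_verified)  # []
```

The trade-off is that every row in the table is visited, including rows the backfill has nothing to do with.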