ArroyoSystems / arroyo

Distributed stream processing engine in Rust

Home Page:https://arroyo.dev

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Expand support for sql window functions to allow more functions, window definitions, and updating inputs

jacksonrnewhouse opened this issue · comments

Arroyo supports window functions, but only under very specific circumstances:

  • Only ROW_NUMBER() is supported.
  • The input data can't be updating.
  • The first term of the partition clause has to be a window struct, from either hop() or tumble().

The first two constraints are simply a matter of missing implementations. There's no semantic issue with supporting window functions such as LAG or ntile(), just a matter of correctly generating the code.

Similarly, updating inputs just require being able to remove or modify records from state, which is doable.

For the final constraint, it's a little trickier. Many of the window functions, such as ROW_NUMBER(), are sensitive to the total order of the partition. A single record coming in can change the value for an arbitrary number of rows. Supporting this style of update may be asking for trouble. That being said, there are some other time-sensitive windowing functions we could support. For instance, if your ordering is by event time and your window function only needs to know preceding terms, you could compute the streaming result by using the watermarks.

This is a tracking issue just to start the discussion. Individual improvements should reference it as we make progress.

Things like a LAG function is very handy for example to calculate the difference in time between 2 different column values like this:

WITH cte1 AS
  (
    SELECT
        window_start AS start,
        node,
        to_time(lastoccurrence, 'UTC') AS last_occurrence,
        to_time(lastclear, 'UTC') AS last_clear,
        raisetally AS alert_count
      FROM tumble(netcool, to_time(actiontime, 'UTC'), 10s)
      WHERE match(summary, 'is not reachable$')
      AND alert_count > 1
      PARTITION BY node
      GROUP BY start, node, last_occurrence, last_clear, alert_count
      ORDER BY last_occurrence
      SETTINGS seek_to='earliest'
  ),
  cte2 AS
    (
      SELECT
        *,
        lag(last_occurrence) OVER (PARTITION BY node) AS prev_occurrence,
        lag(alert_count) OVER (PARTITION BY node) AS prev_alert_count
      FROM alerts
    )
    SELECT
      window_start AS start,
      node,
      date_diff('second',prev_occurrence, last_clear) AS downtime
    FROM tumble(cte2, ts, 1m)
    WHERE alert_count > prev_alert_count;

┌───────────────────start─┬─node───┬─────────last_occurrence─┬──────────────last_clear─┬─alert_count─┬─────────prev_occurrence─┬─prev_alert_count─┐
│ 2024-03-09 20:40:10.000 │ node01 │ 2024-03-09 20:40:10.0002024-03-09 20:40:06.00042024-03-09 20:40:08.0003 │
│ 2024-03-09 20:40:10.000 │ node01 │ 2024-03-09 20:40:12.0002024-03-09 20:40:10.00052024-03-09 20:40:10.0004 │
│ 2024-03-09 20:40:10.000 │ node01 │ 2024-03-09 20:40:14.0002024-03-09 20:40:12.00062024-03-09 20:40:12.0005 │
│ 2024-03-09 20:40:10.000 │ node01 │ 2024-03-09 20:40:16.0002024-03-09 20:40:12.00072024-03-09 20:40:14.0006 │
│ 2024-03-09 20:40:10.000 │ node01 │ 2024-03-09 20:40:18.0002024-03-09 20:40:13.00082024-03-09 20:40:16.0007 │
└─────────────────────────┴────────┴─────────────────────────┴─────────────────────────┴─────────────┴─────────────────────────┴──────────────────┘