ArroyoSystems / arroyo

Distributed stream processing engine in Rust

Home Page: https://arroyo.dev

SQL compilation failure when using non-nullable timestamp as event_time_field or watermark_field

mwylde opened this issue

This query fails to compile:

create table demo_stream (
  timestamp BIGINT NOT NULL,
  event_time TIMESTAMP GENERATED ALWAYS AS (CAST(from_unixtime(timestamp * 1000000000) as TIMESTAMP))
) WITH (
  connector = 'kafka',
  bootstrap_servers = 'localhost:9092',
  topic = 'demo-stream',
  format = 'json',
  type = 'source',
  event_time_field = 'event_time'
);

select * from demo_stream;

When setting the event_time_field and watermark_field, we currently assume those timestamps are nullable, so the generated code fails to compile when the field is non-nullable:

error[E0599]: no method named `expect` found for struct `SystemTime` in the current scope
   --> pipeline/src/main.rs:141:46
    |
138 |   ...                   timestamp: arg
    |  __________________________________-
139 | | ...                       .event_time
140 | | ...                       .clone()
141 | | ...                       .expect("require a non-null timestamp"),
    | |                           -^^^^^^ method not found in `SystemTime`
    | |___________________________|
    |

error[E0599]: no method named `unwrap_or_else` found for struct `SystemTime` in the current scope
   --> pipeline/src/main.rs:175:42
    |
173 | / ...                   arg.watermark
174 | | ...                       .clone()
175 | | ...                       .unwrap_or_else(|| std::time::SystemTime::now())
    | |                           -^^^^^^^^^^^^^^ method not found in `SystemTime`
    | |___________________________|
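
Both errors have the same shape: `expect` and `unwrap_or_else` are methods on `Option<SystemTime>`, not on `SystemTime`, so they only compile when the field really is an `Option`. As a minimal sketch of what nullability-aware code generation could look like, assuming a hypothetical helper that knows whether the field is nullable (the function names and the `nullable` flag are illustrative and not taken from the Arroyo codebase):

```rust
/// Builds the Rust expression that reads the event-time field of a record.
/// Hypothetical helper for illustration; `field` is the column name and
/// `nullable` says whether the struct field is an `Option<SystemTime>`.
pub fn event_time_accessor(field: &str, nullable: bool) -> String {
    if nullable {
        // Option<SystemTime>: unwrap it, panicking on a NULL timestamp.
        format!("arg.{field}.clone().expect(\"require a non-null timestamp\")")
    } else {
        // Plain SystemTime: there is nothing to unwrap.
        format!("arg.{field}.clone()")
    }
}

/// Same idea for the watermark field, which falls back to the current
/// time when a nullable value is NULL.
pub fn watermark_accessor(field: &str, nullable: bool) -> String {
    if nullable {
        format!("arg.{field}.clone().unwrap_or_else(|| std::time::SystemTime::now())")
    } else {
        format!("arg.{field}.clone()")
    }
}
```

With `nullable = false`, `event_time_accessor("event_time", false)` yields `arg.event_time.clone()`, which type-checks against a plain `SystemTime` field.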

This is caused by a mismatch between the data type of the event_time field as declared in the CREATE TABLE signature (event_time TIMESTAMP) and the type of the expression that computes it (CAST(from_unixtime(timestamp * 1000000000) as TIMESTAMP)). In particular, the declaration event_time TIMESTAMP doesn't imply a not-null constraint, but the expression is not nullable. This results in a StructField whose type is not-null, while the generated code treats it as nullable, and thus as an Option. We could throw an error for it at

let expr = expression_context.compile_expr(&df_expr)?;

We could also add a CAST to the declared type in the case of a mismatch.
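
The two options can be sketched side by side. This is only an illustration under stated assumptions: `Policy`, `Action`, and `reconcile` are hypothetical names, not part of Arroyo or DataFusion, and "coerce" here stands in for casting the expression to the declared (nullable) type.

```rust
/// How to handle a nullability mismatch (hypothetical, for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Policy {
    /// Report the mismatch to the user as a planning error.
    Reject,
    /// Adapt the expression to the declared type.
    Coerce,
}

/// What the planner should do with the compiled expression.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Declared and computed nullability agree; use the expression as-is.
    Keep,
    /// Expression is non-null but the column is nullable: wrap in `Some`.
    WrapInSome,
}

/// Compares the declared nullability of a generated column with the
/// nullability of the expression that computes it.
pub fn reconcile(
    field: &str,
    declared_nullable: bool,
    expr_nullable: bool,
    policy: Policy,
) -> Result<Action, String> {
    match (declared_nullable, expr_nullable) {
        // Both sides agree: nothing to do.
        (true, true) | (false, false) => Ok(Action::Keep),
        // The case from this issue: nullable column, non-null expression.
        (true, false) => match policy {
            Policy::Coerce => Ok(Action::WrapInSome),
            Policy::Reject => Err(format!(
                "column `{field}` is declared nullable but its expression is NOT NULL"
            )),
        },
        // A possibly-NULL value cannot be safely narrowed to NOT NULL.
        (false, true) => Err(format!(
            "column `{field}` is declared NOT NULL but its expression may produce NULL"
        )),
    }
}
```

For the table above, `reconcile("event_time", true, false, Policy::Coerce)` returns `Ok(Action::WrapInSome)`, while `Policy::Reject` returns an error that could be surfaced at the `compile_expr` call site.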