Virtual fields are included in source struct
mwylde opened this issue · comments
When creating a table with virtual fields (for example, to generate a computed timestamp or watermark field), the source struct should contain only the physical fields, because it is used to deserialize data off of the source. The virtual fields should then be added in a post-source filter step.
However (possibly as a consequence of the changes in #184) virtual fields are being included in the source struct.
For example:
create table stream (
    timestamp BIGINT NOT NULL,
    event_time TIMESTAMP NOT NULL GENERATED ALWAYS AS (CAST(from_unixtime(timestamp * 1000000000) as TIMESTAMP))
) WITH (
    connector = 'kafka',
    bootstrap_servers = 'localhost:9092',
    topic = 'stream',
    format = 'json',
    type = 'source',
    event_time_field = 'event_time'
);
select * from stream;
Creates a pipeline with this source struct:
pub struct generated_struct_5280044764730053267 {
    pub timestamp: i64,
    #[serde(with = "arroyo_worker::formats::timestamp_as_rfc3339")]
    pub event_time: std::time::SystemTime,
}
But because the virtual field is non-nullable, this causes deserialization errors when the events (correctly) do not include the computed event_time field.
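For reference, here is a minimal sketch of the split I would expect, using only the standard library. The names `SourceRecord`, `StreamRecord`, and `add_virtual_fields` are hypothetical and are not what Arroyo generates. The sketch also assumes the `timestamp` column holds whole seconds since the Unix epoch, which is a guess based on the multiplication in the query above.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Hypothetical source struct: physical fields only, so an event such as
/// {"timestamp": 1700000000} can be deserialized without an event_time key.
#[derive(Debug, Clone, PartialEq)]
pub struct SourceRecord {
    pub timestamp: i64,
}

/// Hypothetical struct produced by the post-source step: the physical
/// fields plus the computed (virtual) event_time field.
#[derive(Debug, Clone, PartialEq)]
pub struct StreamRecord {
    pub timestamp: i64,
    pub event_time: SystemTime,
}

/// Post-source step that fills in the virtual field.
/// Assumption: `timestamp` is seconds since the Unix epoch.
/// Negative values are clamped to the epoch to keep the sketch simple.
pub fn add_virtual_fields(src: SourceRecord) -> StreamRecord {
    let secs = u64::try_from(src.timestamp).unwrap_or(0);
    StreamRecord {
        timestamp: src.timestamp,
        event_time: UNIX_EPOCH + Duration::from_secs(secs),
    }
}
```

With this shape, the deserializer only ever sees `SourceRecord`, so a missing `event_time` key in the incoming JSON is no longer an error, and `event_time` stays non-nullable downstream because it is always computed.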