To extract partition fields from a timestamp
shift-alt-del opened this issue · comments
Wei.D commented
Hi, I'm working on a PoC to sink logs from Kafka into Iceberg format. I want to partition the logs under `year=YYYY/month=MM/day=DD`, but I only have a timestamp inside the log.
I couldn't find any configuration for partitioning logs by timestamp, so I'm wondering whether any workarounds already exist.
I think one workaround is to use an SMT to copy `ts_ms` into `year`, `month`, and `day`, extract the corresponding value into each of the 3 fields, and set them in `iceberg.tables.default-partition-by`. However, that clutters the connector config and requires writing a custom SMT function...
As a concrete example, my log format looks like this:
{
"ts_ms": 1588252618953,
"data": "abcd"
}
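To illustrate what such a transform would need to compute, here is a minimal Python sketch (not an actual SMT, which would be written in Java against the Kafka Connect API). The function names `add_partition_fields` and `partition_path` are hypothetical, and it assumes `ts_ms` is epoch milliseconds and that partitions should be derived in UTC.

```python
from datetime import datetime, timezone


def add_partition_fields(record, ts_field="ts_ms"):
    """Return a copy of the record with year/month/day derived from ts_field.

    Assumes ts_field holds epoch milliseconds and that UTC is the desired zone.
    """
    # Integer division drops the millisecond part, which is irrelevant for a date.
    ts = datetime.fromtimestamp(record[ts_field] // 1000, tz=timezone.utc)
    return {**record, "year": ts.year, "month": ts.month, "day": ts.day}


def partition_path(record):
    """Format the derived fields as year=YYYY/month=MM/day=DD."""
    return f"year={record['year']:04d}/month={record['month']:02d}/day={record['day']:02d}"


log = {"ts_ms": 1588252618953, "data": "abcd"}
enriched = add_partition_fields(log)
print(partition_path(enriched))  # year=2020/month=04/day=30
```

The original `ts_ms` and `data` fields are left unchanged; only the three derived fields are added, which are the ones that would be listed as partition columns.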
Thanks.
Wei.D commented