apache / pinot

Apache Pinot - A realtime distributed OLAP datastore

Home Page:https://pinot.apache.org/

Row level TTLs in Pinot

rohityadav1993 opened this issue

The requirement comes from use cases that need to filter out, at query time, rows whose TTL has passed, where the TTL is a timestamp stored with each row.
This can currently be achieved with a caller-side implementation:

  1. Create a row_ttl time column whose value is provided by upstream.
  2. Decorate every query with an additional filter clause, e.g.: select count(*) from <table> where row_ttl > current_timestamp
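The two caller-side steps above could be wrapped in a small helper so that no query is sent without the TTL filter. This is only a sketch: `build_query` is a hypothetical helper, `bids` is a made-up table name, and it assumes `row_ttl` is stored as epoch milliseconds so the current time can be inlined as a literal.

```python
import time


def build_query(select_expr, table, user_filter=None, ttl_column="row_ttl", now_ms=None):
    """Build a query string that always excludes rows whose TTL has passed.

    Hypothetical caller-side helper; assumes ttl_column holds epoch milliseconds.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    ttl_clause = f"{ttl_column} > {now_ms}"
    if user_filter:
        # Parenthesize the caller's filter so an OR inside it cannot bypass the TTL clause.
        where = f"({user_filter}) AND {ttl_clause}"
    else:
        where = ttl_clause
    return f"SELECT {select_expr} FROM {table} WHERE {where}"


print(build_query("count(*)", "bids", now_ms=1000))
print(build_query("count(*)", "bids", user_filter="price > 10 OR vip = true", now_ms=1000))
```

Every client has to remember to go through such a helper, which is the adoption cost the native feature would remove.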

Pinot could provide this as a native feature to make it easier to adopt.

The SLA for filtering out expired rows is strict, so a minion-based approach cannot be used.

This requirement seems a little strange. Say a row's timestamp is within the TTL at ingestion time: before the segment expires, the timestamp could fall outside the TTL while the row is still queryable. I don't think we can enforce TTL strictly during ingestion.

@Jackie-Jiang, I might not have phrased this clearly; I am updating the description. The filtering is needed on the query side, and the TTL applies to a timestamp stored in one of the row's columns (provided by upstream).

I see. Does it work if we build a feature to automatically add a filter: where timeCol > currentTime - TTL, with TTL configurable? The time column is already configured for most tables, and it is used to manage retention.
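To make the proposed duration-based filter concrete, here is a minimal sketch of how the clause could be derived from a configured TTL. `duration_ttl_filter` is a hypothetical name, not a Pinot API, and the values are assumed to be epoch milliseconds.

```python
def duration_ttl_filter(time_col, ttl_ms, now_ms):
    """Return the clause 'timeCol > currentTime - TTL' with the cutoff precomputed.

    Hypothetical helper; ttl_ms is the configurable TTL duration in milliseconds.
    """
    cutoff = now_ms - ttl_ms
    return f"{time_col} > {cutoff}"


# With a TTL of 1000 ms and a current time of 5000 ms, rows at or before 4000 ms are filtered out.
print(duration_ttl_filter("timeCol", ttl_ms=1000, now_ms=5000))
```

Note that the same TTL duration applies to every row here; only the row's time column differs.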

This is more of an expiry time; no TTL duration is provided. Example use case: a bid expires on 1 Jan 2024 00:00:00. This time column is independent of the event time, which is used for retention management. So a filter such as rowExpiryTime > currentTime would be useful.
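The bid example can be sketched as a per-row check in which each row carries its own absolute expiry time, separate from its event time. The row layout, the `is_visible` helper, the event time, and the choice of UTC and epoch milliseconds are all assumptions for illustration.

```python
from datetime import datetime, timezone


def is_visible(row, now_ms, expiry_col="rowExpiryTime"):
    """A row is visible only while its own expiry time is still in the future."""
    return row[expiry_col] > now_ms


# Bid expiring on 1 Jan 2024 00:00:00 (UTC assumed), expressed in epoch milliseconds.
expiry_ms = int(datetime(2024, 1, 1, tzinfo=timezone.utc).timestamp() * 1000)

# eventTime is an arbitrary earlier instant; it plays no part in the expiry check.
bid = {"bidId": "b1", "eventTime": expiry_ms - 86_400_000, "rowExpiryTime": expiry_ms}

print(is_visible(bid, now_ms=expiry_ms - 1))  # one millisecond before expiry
print(is_visible(bid, now_ms=expiry_ms))      # at the expiry instant
```

Because the comparison is strict, the row stops being visible at the expiry instant itself.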

I see. So basically you want to apply TTL on any date time field, where Pinot can automatically apply a filter?

Yes, precisely this. I think any solution that tries to delete the row at its expiry time would be too complex to implement. Applying a filter at query time, and then removing the expired rows later through minion compaction or an externally invoked delete, would be simpler.