microsoft / Trill

Trill is a single-node query processor for temporal or streaming data.

QueryContainer.Flush?

nsulikowski opened this issue

When would someone call that method instead of, say, managing flushes with punctuations?

Hi @nsulikowski ,

Punctuations are primarily meant to move time forward for the stream, whereas a flush outputs every event that can be output at that point, yielding lower latency for those events at the potential cost of lower throughput.

While the default behavior is to flush on punctuations/low watermarks (FlushPolicy.FlushOnPunctuation or PartitionedFlushPolicy.FlushOnLowWatermark), some consumers require independent control over these two concepts. When another FlushPolicy is specified, the consumer can call Flush explicitly whenever a flush is desired (though doing so is optional).

For example, a consumer may want very frequent punctuations to keep Trill's state to a minimum (allowing it to clean up any state associated with time before the punctuation), but may want to flush less frequently for greater throughput (instead of the lower latency that more frequent flushes give). Or, a consumer may want to issue low watermarks to ensure that the time difference between a stream's partitions is kept to a reasonable limit, but may not want to sacrifice throughput to do so. Or, a consumer may want to flush very frequently to ensure minimal latency, but may not want to artificially move time forward to do so.
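To make the distinction concrete, here is a minimal toy model in Python. It is not Trill's API: the class `ToyStream`, its methods, and the `batch_size` default are all hypothetical names and values invented for illustration. It only sketches the idea described above: a punctuation advances time and lets old state be dropped, while a flush pushes already-computed output downstream, and the two can be coupled or driven separately.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Event = Tuple[int, str]  # (timestamp, payload)


@dataclass
class ToyStream:
    """Toy pass-through operator. Hypothetical; NOT Trill's actual API."""
    flush_on_punctuation: bool = True   # mimics the default coupling
    batch_size: int = 1000              # arbitrary value chosen for this sketch
    current_time: int = 0
    state: List[Event] = field(default_factory=list)      # events the operator still holds
    pending: List[Event] = field(default_factory=list)    # output computed but not delivered
    delivered: List[Event] = field(default_factory=list)  # output seen downstream

    def on_event(self, time: int, payload: str) -> None:
        self.state.append((time, payload))
        self.pending.append((time, payload))
        # A full batch is sent on its own; this is the high-throughput path.
        if len(self.pending) >= self.batch_size:
            self.flush()

    def punctuate(self, time: int) -> None:
        # Move time forward and drop state that lies before the punctuation.
        self.current_time = max(self.current_time, time)
        self.state = [(t, p) for t, p in self.state if t >= self.current_time]
        if self.flush_on_punctuation:
            self.flush()

    def flush(self) -> None:
        # Deliver whatever output is ready, even if the batch is not full.
        self.delivered.extend(self.pending)
        self.pending.clear()


# 1) Frequent punctuations, explicit flushes: state is trimmed, output waits.
manual = ToyStream(flush_on_punctuation=False)
manual.on_event(1, "a")
manual.on_event(2, "b")
manual.punctuate(2)
after_punctuation = list(manual.delivered)  # nothing delivered yet
manual.flush()
after_flush = list(manual.delivered)        # both events delivered now

# 2) Default coupling: the punctuation also flushes.
default = ToyStream()
default.on_event(1, "a")
default.punctuate(1)

# 3) Frequent flushes without moving time forward.
low_latency = ToyStream(flush_on_punctuation=False)
low_latency.on_event(5, "x")
low_latency.flush()
```

In the first scenario the punctuation trims state without delivering anything, which corresponds to the "frequent punctuations, infrequent flushes" case. The third scenario delivers output immediately while `current_time` stays where it was, which corresponds to flushing for latency without artificially advancing time.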