microsoft / Trill

Trill is a single-node query processor for temporal or streaming data.


buffering for flush?

Ohyoukillkenny opened this issue

Question: does Trill buffer the stream for batch flushing?

Trill's flush policy specifies when batched output events are flushed through the entire query from the ingress site. For non-partitioned streams, the default policy is "FlushOnPunctuation": whenever a punctuation is ingressed or generated, a flush is also propagated through the query.
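
For illustration, here is a minimal sketch (not from the original issue; Observable.Range, the element count of 10, and the punctuation time are arbitrary choices) showing that, under the default policies, an explicitly ingressed punctuation forces whatever has been batched so far to be flushed through the query:

using System;
using System.Reactive.Linq;
using Microsoft.StreamProcessing;

public static class PunctuationFlushSketch
{
    public static void Run()
    {
        // Ten interval events followed by an explicit punctuation at time 10.
        // With the default FlushPolicy.FlushOnPunctuation, the punctuation causes
        // the batched events to be flushed through the query; with the default
        // OnCompletedPolicy.EndOfStream, anything still buffered is flushed when
        // the observable completes.
        IObservable<StreamEvent<long>> input = Observable.Range(0, 10)
            .Select(i => StreamEvent.CreateInterval((long)i, i + 1, (long)i))
            .Concat(Observable.Return(StreamEvent.CreatePunctuation<long>(10)));

        input
            .ToStreamable()            // defaults: FlushOnPunctuation, EndOfStream
            .Select(x => x)            // identity query, as in the question below
            .ToStreamEventObservable()
            .Subscribe(e => Console.WriteLine(e));
    }
}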

Moreover, the default OnCompletedPolicy is "EndOfStream", which moves the "current time" to infinity before completing, causing any tuples still inside the system to be pushed through the query plan.

Therefore, given a stream of integers "1, 2, 3, 4, ..., N", suppose Trill's query plan applies an identity mapping to the integers (".Select(x => x)"); by default, all the integers will be batched together and flushed when the stream terminates.

To me, it looks like Trill buffers all the input integers somewhere and releases the buffered data when the stream terminates (i.e., when OnCompleted is called). So when N is large, e.g., one billion, the program should consume a huge amount of memory.

However, when I profiled the program in my experiments, I found that Trill actually used very little memory. So I would like to ask what Trill's magic is: how can Trill "flush" the data batches without buffering all the data?

Code for the experiment is attached below:

using System;
using System.Reactive.Disposables;
using System.Reactive.Linq;
using Microsoft.StreamProcessing;

public class TrillSimple
{
    public static void Run()
    {
        // disable Trill's columnar optimization
        Config.ForceRowBasedExecution = true;
        // initialize an input stream (an Rx.NET observable) that produces
        //     a stream of "long" values: 0, 1, 2, ..., N-1
        IObservable<long> rxStream = new InputStream(1000000000);
        // initialize a consumer that consumes the results by updating the last value it receives
        IObserver<StreamEvent<long>> rxConsumer = new LastSaver<StreamEvent<long>>();
        
        // ingress the rx stream into Trill
        //     + add timestamp by CreateInterval: {start: 0, end 1, val 0}, {start: 1, end 2, val 1}, ...
        //     + flush batched data whenever a punctuation is ingressed or generated (FlushOnPunctuation)
        IObservableIngressStreamable<long> ingressStream =
            rxStream.Select(x => StreamEvent.CreateInterval(x, x + 1, x))
                .ToStreamable(DisorderPolicy.Drop(), 
                              FlushPolicy.FlushOnPunctuation,
                              OnCompletedPolicy.EndOfStream);
        
        // A type representing the empty struct, similar to Unit in other libraries and languages.
        //     Here, Empty is the grouping key type for the streaming data
        IStreamable<Empty,long> trillFilteredStream =
            ingressStream.Select(evt => evt);
        
        // egress Trill's output into rx
        IObservable<StreamEvent<long>> outputStream = 
            trillFilteredStream.ToStreamEventObservable();
        
        // execution of the query
        outputStream.Subscribe(rxConsumer);
    }
}

public class InputStream : IObservable<long>
{
    private readonly long N;
    public InputStream(long N) {
        this.N = N;
    }

    public IDisposable Subscribe(IObserver<long> observer)
    {
        for (long i = 0; i < this.N; i++) {
            // Console.WriteLine("Input Sends: "+i);
            observer.OnNext(i);
        }
        observer.OnCompleted();
        // since no more input will be sent, return an empty disposable
        return Disposable.Empty;
    }
}
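
The LastSaver consumer referenced above is not included in the issue; a minimal sketch of what such an observer might look like (assuming it only needs to remember the most recent item it receives) is:

using System;

public class LastSaver<T> : IObserver<T>
{
    // Keeps only the most recent item pushed by the query, so it adds no buffering of its own.
    public T Last { get; private set; }

    public void OnNext(T value) => this.Last = value;
    public void OnError(Exception error) => throw error;
    public void OnCompleted() { }
}
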
commented

Each Trill operator (including ingress) can buffer up to Config.DataBatchSize (configurable, default 80,000) events. When the FlushPolicy is not specified or is set to None, an operator buffers events until its batch is full, at which point it egresses that batch to the next operator(s). Of course, operators only buffer the state they need, so, e.g., if you are aggregating the sum of events over some window, that aggregate state is just the sum, not all the events contributing to it.
FlushPolicy.FlushOnBatchBoundary will buffer events at ingress until the batch fills up, but then flushes that batch throughout the entire query, even if downstream operators reduce the number of events so that the downstream operator buffers are not full.
FlushPolicy.FlushOnPunctuation/FlushOnLowWatermark will flush all buffers in response to a punctuation/low watermark (either explicitly ingressed into Trill or generated by Trill).
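
As a concrete illustration of the above (a sketch only; the batch size of 1,000 and the element count are arbitrary, and the ToStreamable parameter order follows the overload used in the question), the batch size and flush policy can be set like this:

using System;
using System.Reactive.Linq;
using Microsoft.StreamProcessing;

public static class BatchBoundaryFlushSketch
{
    public static void Run()
    {
        // Each operator buffers at most this many events per batch (default 80,000).
        Config.DataBatchSize = 1000;

        Observable.Range(0, 100000)
            .Select(i => StreamEvent.CreateInterval((long)i, i + 1, (long)i))
            // FlushOnBatchBoundary: flush each full ingress batch through the whole query.
            .ToStreamable(DisorderPolicy.Drop(), FlushPolicy.FlushOnBatchBoundary)
            .Select(x => x)
            .ToStreamEventObservable()
            .Subscribe(e => { /* consume */ });
    }
}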

Thanks for the reply. After setting Config.DataBatchSize to a large number, I did observe memory utilization explode. Thanks again.