microsoft / Trill

Trill is a single-node query processor for temporal or streaming data.

Battery usage rate & recharge detection

unruledboy opened this issue · comments

Hi All,

This is an awesome project. I have been following and learning it for quite a while, and I finally have a perfect candidate for real-world use.

We have a desktop app that runs on many clients' machines, monitoring the battery usage of their laptops.

We know that:

  • Without charging, the battery drains slowly
  • When charging, the battery refills quicker than normal usage

The data come in this structure

  • Id: Guid
  • DeviceId: Guid (to be partitioned on)
  • Timestamp: DateTime
  • Percentage: double (the current battery percentage remaining)

What we need to do:

  • Determine the usage per minute (this is straightforward, using the sample from the QueryWritingGuide).
  • Determine the recharge cycle (start and finish). I'm not sure how. What I can think of is to capture the lowest point (in a certain period), detect when the level starts to go up, find the highest point after that, and then work out the difference.
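
For a single device, and without Trill, the second bullet could look roughly like the sketch below. It is plain C#; `FindRechargeCycles` and the tuple shape are hypothetical names chosen for illustration, and it assumes the samples are already sorted by time.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Trill): scan one device's samples in time order
// and report each stretch where the percentage rises from a low point to a peak.
static List<(DateTime Start, DateTime End, double Gain)> FindRechargeCycles(
    IReadOnlyList<(DateTime Time, double Percentage)> samples)
{
    var cycles = new List<(DateTime Start, DateTime End, double Gain)>();
    int low = 0;          // index of the lowest point before the current rise
    bool rising = false;  // true while we are inside a recharge

    for (int i = 1; i < samples.Count; i++)
    {
        double prev = samples[i - 1].Percentage, cur = samples[i].Percentage;
        if (cur > prev)
        {
            // The recharge starts at the previous (lowest) sample.
            if (!rising) { low = i - 1; rising = true; }
        }
        else if (cur < prev && rising)
        {
            // The previous sample was the peak, so close the cycle there.
            cycles.Add((samples[low].Time, samples[i - 1].Time, prev - samples[low].Percentage));
            rising = false;
        }
    }
    // A rise that is still in progress at the end of the data is not reported.
    return cycles;
}

var t0 = new DateTime(2020, 1, 1, 9, 0, 0);
var demo = new List<(DateTime, double)>
{
    (t0, 50.0), (t0.AddMinutes(1), 49.0), (t0.AddMinutes(2), 48.0),  // draining
    (t0.AddMinutes(3), 55.0), (t0.AddMinutes(4), 62.0),              // recharging
    (t0.AddMinutes(5), 61.0),                                        // draining again
};
foreach (var c in FindRechargeCycles(demo))
    Console.WriteLine($"{c.Start:HH:mm} -> {c.End:HH:mm}: +{c.Gain}%");
```

On the demo data this reports one cycle, from the 48% sample to the 62% sample. It keeps all state in two local variables, which is the part that gets harder once many devices arrive interleaved on one stream.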

The tricky parts are:

  • Every usage -> recharge sequence is its own cycle. How do we achieve the two things above without overlapping (false detection)? It's hard to put them together without them interfering with each other.
  • Multiple laptops report at the same time, so we need to partition them. I believe GroupApply can be used for this.

The picture below illustrates the typical usage of the battery of ONE laptop:

  • The black line indicates the usage
  • The red line indicates recharge

[Image: battery]

commented

@unruledboy For the partitioning aspect, you could use the Grouping functions as you suggest, but this has limitations, e.g., all devices must abide by the same singular timeline, such that all events must be ordered even across devices. Alternatively, you can use a partitioned query, where each ingressed event is a PartitionedStreamEvent, partitioned by the device Guid. Trill will then execute the query on each device independently, and each device may be on its own timeline, so as long as each device's sub-stream is ordered, things will still work as expected.
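
To illustrate the difference between the two ordering requirements, here is a small plain C# sketch with no Trill dependency. The helper names and the tuple shape are made up for this example. It shows a sequence that is ordered within each device but not across devices, which is the case the partitioned approach is meant to tolerate.

```csharp
using System;
using System.Collections.Generic;

// True if timestamps never go backwards across the whole stream
// (the single shared timeline described above for grouped queries).
static bool IsGloballyOrdered(IEnumerable<(Guid DeviceId, long Timestamp)> events)
{
    long last = long.MinValue;
    foreach (var e in events)
    {
        if (e.Timestamp < last) return false;
        last = e.Timestamp;
    }
    return true;
}

// True if timestamps never go backwards within each device
// (each device on its own timeline, as in a partitioned query).
static bool IsOrderedPerDevice(IEnumerable<(Guid DeviceId, long Timestamp)> events)
{
    var lastSeen = new Dictionary<Guid, long>();
    foreach (var e in events)
    {
        if (lastSeen.TryGetValue(e.DeviceId, out long last) && e.Timestamp < last) return false;
        lastSeen[e.DeviceId] = e.Timestamp;
    }
    return true;
}

var a = Guid.NewGuid();
var b = Guid.NewGuid();
var mixed = new[] { (a, 100L), (a, 101L), (b, 50L), (b, 51L), (a, 102L) };

Console.WriteLine(IsGloballyOrdered(mixed));   // False: device b is behind device a
Console.WriteLine(IsOrderedPerDevice(mixed));  // True: each device is ordered on its own
```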

For detecting the cycles, it depends a bit on how you want to consume your output, since the two pieces of information have very different formats. You could just have two output streams based on the same (multicast) input stream - one for the recharge cycles and one for the usage/min. E.g., to output a single event on every transition (charging -> draining or draining -> charging), you could set up a simple state machine using our Regex/Afa APIs, that track the current cycle's start time/battery level in the accumulation register.

@peterfreiling You are right about the two different pieces of biz logic. I was thinking I could use two different queries, one per piece of logic, but that would double the resources and potentially halve the performance, right?

I think the most challenging part is how to "track" the state transition, without false positive detection because of overlapping cycles.

commented

I'm not sure what you mean by "overlapping cycles". Each device should either be charging or draining at any given point, right? E.g., here's a quick and dirty sample that uses the pattern matching APIs to set up a simple state machine to track the battery cycle information, using BatteryCycleInfo as the register to accumulate state, and egresses each cycle information (starting/ending battery, start/end time) as soon as the device switches from charging->draining or draining->charging.

class BatteryCycleInfo : IEquatable<BatteryCycleInfo>
{
    public bool? Charging = null; // true: charging, false: draining, null: uninitialized

    public (long timestamp, int batteryPercent) CycleStart = (-1, -1);
    public (long timestamp, int batteryPercent) CycleEnd = (-1, -1);
    public (long timestamp, int batteryPercent) LastSample = (-1, -1);

    private bool egressed = false;

    public BatteryCycleInfo()
    {
    }

    public BatteryCycleInfo(bool charging, (long time, int battery) cycleStart, (long time, int battery) cycleEnd)
    {
        this.Charging = charging;
        this.CycleStart = cycleStart;
        this.CycleEnd = cycleEnd;
    }

    public bool HasSample => this.LastSample.batteryPercent != -1;

    // If this cycle has just egressed, flip the charging status
    public bool? IsCurrentlyCharging => this.egressed ? !this.Charging.Value : this.Charging.Value;

    public BatteryCycleInfo UpdateBatteryPercent(long timestamp, int batteryPercent)
    {
        if (this.egressed)
        {
            // This instance has been egressed from previous cycle, so return a new instance, flipping the Charging status
            return new BatteryCycleInfo()
            {
                Charging = !this.Charging.Value,
                CycleStart = this.CycleEnd,
                LastSample = (timestamp, batteryPercent),
            };
        }
        else
        {
            this.LastSample = (timestamp, batteryPercent);
            return this;
        }
    }

    public BatteryCycleInfo InitializeCycle(long timestamp, int batteryPercent)
    {
        this.Charging = (this.LastSample.batteryPercent < batteryPercent);
        this.CycleStart = this.LastSample;
        this.LastSample = (timestamp, batteryPercent);
        return this;
    }

    public BatteryCycleInfo Egress(long timestamp, int batteryPercent)
    {
        if (this.egressed)
        {
            // This instance has been egressed from previous cycle, and this new cycle was a single-sample cycle,
            // so return a new instance, flipping the Charging status
            return new BatteryCycleInfo()
            {
                Charging = !this.Charging.Value,
                CycleStart = this.CycleEnd,
                CycleEnd = this.LastSample,
                LastSample = (timestamp, batteryPercent),
                egressed = true,
            };
        }
        else
        {
            this.CycleEnd = this.LastSample;
            this.LastSample = (timestamp, batteryPercent);
            this.egressed = true;
            return this;
        }
    }

    public bool Equals(BatteryCycleInfo other) => other != null && this.Charging == other.Charging && this.CycleStart == other.CycleStart && this.CycleEnd == other.CycleEnd;
}

private enum DeviceBatteryState
{
    Undetermined = 0,
    Changing = 2,
    Output = 3,
}

[TestMethod, TestCategory("Gated")]
public void BatteryUsage()
{
    // State machine:
    //  Undetermined (0): not initialized
    //  Changing (2): battery either charging or draining
    //  Output (3): transitioning between charging->draining or draining->charging. This is a final state, so an output is produced, but it also transitions back to Changing
    // For all state transitions, the register (BatteryCycleInfo) carries the last sample and the current cycle's start/end
    var pattern = Afa.Create<int, BatteryCycleInfo>();

    // Undetermined->Undetermined:
    // First sample we don't know if it's charging or draining. Stay on this state until the battery % changes.
    pattern.AddSingleElementArc(
        fromState: (int)DeviceBatteryState.Undetermined,
        toState: (int)DeviceBatteryState.Undetermined,
        fence: (timestamp, newBatteryPercent, register) => register == null || !register.HasSample || register.LastSample.batteryPercent == newBatteryPercent,
        (timestamp, newBatteryPercentage, register) => (register ?? new BatteryCycleInfo()).UpdateBatteryPercent(timestamp, newBatteryPercentage));

    // Undetermined->Changing
    pattern.AddSingleElementArc(
        fromState: (int)DeviceBatteryState.Undetermined,
        toState: (int)DeviceBatteryState.Changing,
        fence: (timestamp, newBatteryPercent, register) => register != null && register.HasSample && register.LastSample.batteryPercent != newBatteryPercent,
        (timestamp, newBatteryPercentage, register) => register.InitializeCycle(timestamp, newBatteryPercentage));

    // Changing->Changing
    pattern.AddSingleElementArc(
        fromState: (int)DeviceBatteryState.Changing,
        toState: (int)DeviceBatteryState.Changing,
        fence: (timestamp, newBatteryPercent, register) =>
            register.IsCurrentlyCharging == true
                ? register.LastSample.batteryPercent <= newBatteryPercent // Charging -> Charging
                : register.LastSample.batteryPercent >= newBatteryPercent, // Draining -> Draining
        (timestamp, newBatteryPercentage, register) => register.UpdateBatteryPercent(timestamp, newBatteryPercentage));

    // End of current cycle, first transition to Output to egress the change, then transition back to Changing
    pattern.AddSingleElementArc(
        fromState: (int)DeviceBatteryState.Changing,
        toState: (int)DeviceBatteryState.Output,
        fence: (timestamp, newBatteryPercent, register) =>
            register.IsCurrentlyCharging == true
                ? register.LastSample.batteryPercent > newBatteryPercent // Charging -> Draining
                : register.LastSample.batteryPercent < newBatteryPercent, // Draining -> Charging
        (timestamp, newBatteryPercentage, register) => register.Egress(timestamp, newBatteryPercentage));

    pattern.AddFinalState((int)DeviceBatteryState.Output);
    pattern.AddEpsilonElementArc((int)DeviceBatteryState.Output, (int)DeviceBatteryState.Changing);

    // Since our state machine can only have one arc active at any given point (e.g., a device cannot be both charging and draining),
    // our pattern is deterministic, and we can use a single pattern matcher for the duration of the input stream.
    // const bool patternIsDeterministic = true;
    const bool allowOverlappingInstances = false;

    var qc = new QueryContainer();
    var input = new Subject<StreamEvent<int /*battery %*/>>();
    var ingress = qc.RegisterInput(input, DisorderPolicy.Drop(reorderLatency: 500));

    var query = ingress
        .Detect(pattern, maxDuration: 100, allowOverlappingInstances);

    var output = new List<BatteryCycleInfo>();
    qc.RegisterOutput(query)
        .Where(e => e.IsData)
        .ForEachAsync(e => output.Add(e.Payload));

    var process = qc.Restore();

    // Undetermined
    input.OnNext(StreamEvent.CreateStart(100, 50));
    input.OnNext(StreamEvent.CreateStart(101, 50));
    input.OnNext(StreamEvent.CreateStart(102, 50));

    // -> Charging
    input.OnNext(StreamEvent.CreateStart(103, 55));
    input.OnNext(StreamEvent.CreateStart(104, 60));
    input.OnNext(StreamEvent.CreateStart(105, 65));
    input.OnNext(StreamEvent.CreateStart(106, 65));
    input.OnNext(StreamEvent.CreateStart(107, 70));

    // -> Draining
    input.OnNext(StreamEvent.CreateStart(108, 69));
    input.OnNext(StreamEvent.CreateStart(109, 68));
    input.OnNext(StreamEvent.CreateStart(110, 67));

    // -> Charging
    input.OnNext(StreamEvent.CreateStart(111, 72));

    input.OnCompleted();

    var expected = new BatteryCycleInfo[]
    {
        new BatteryCycleInfo(charging: true, cycleStart: (102, 50), cycleEnd: (107, 70)),
        new BatteryCycleInfo(charging: false, cycleStart: (107, 70), cycleEnd: (110, 67)),
    };
    Assert.IsTrue(expected.SequenceEqual(output));
}

@peterfreiling this sample works like a treat! great thanks for the detailed explanation!

One last thing: the output isn't flushed until we call input.OnCompleted(), and I can't just call that on a live incoming stream. I tried setting a flush policy via RegisterInput and adjusting maxDuration, but neither worked, so I guess that is not the right way.

I also read through all the tests for similar usage but could not find the answer.

commented

The OnCompleted call was just for test purposes. You have many other options to control the Flush behavior:

  • FlushPolicy
  • Calling Process.Flush (Process is returned by QueryContainer.Restore)
  • Don't manually flush, and instead wait for the buffers to fill up to the buffer size before egressing to the next operator

@peterfreiling I investigated FlushPolicy before I asked the question above. I should have set up a periodic punctuation (say, every 5 minutes) and tested whether it behaves as I expect.

I just ran that test, and it seems to process the events correctly without losing state.

For the buffer size, do you mean maxDuration (currently 100)? I tested with hundreds of inputs but got no output.

BTW, I noticed that if I replace the artificial timestamp values (100, 101, 102) with (DateTime.Now + TimeSpan.FromSeconds(n)).Ticks (n being 0, 1, 2, 3, etc. for test purposes), your original test case produces no output.

Thanks,
Wilson

commented

By buffer size, I was referring to Config.DataBatchSize, not maxDuration. If FlushPolicy.None is specified, then each Trill operator will not flush its output batch until it is full with DataBatchSize events.
If you use a real DateTime be sure to set maxDuration appropriately (currently it is set to 100 ticks, which is unrealistically small for actual time).
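
To put numbers on that, here is a minimal sketch using only System types. The 12-hour bound is an arbitrary value picked for illustration, not a recommendation; the right maxDuration depends on how long a single cycle can last in your data.

```csharp
using System;

// One .NET tick is 100 nanoseconds, so 100 ticks is only 10 microseconds.
Console.WriteLine(TimeSpan.TicksPerSecond);   // 10000000

// Hypothetical choice: let a single pattern match span up to 12 hours.
long maxDuration = TimeSpan.FromHours(12).Ticks;
Console.WriteLine(maxDuration);               // 432000000000

// Timestamps built the way the question describes, one second apart.
long first = DateTime.UtcNow.Ticks;
long second = first + TimeSpan.FromSeconds(1).Ticks;

// Two consecutive samples are already much further apart than 100 ticks,
// which is presumably why the test stops matching with real timestamps.
Console.WriteLine(second - first > 100);      // True
```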

Now I have it working with a periodic FlushPolicy. I really appreciate all the detailed explanations and patient help.

In our biz, we don't just capture and process the battery percentage; we also process heaps of other types of data. So even if the incoming data structure is the same, we still have to create dedicated ingress/egress stream processors, because the patterns differ and are backed by different state machines.

I was wondering what the best practice is for this scenario, as the biz needs to process hundreds if not thousands of types of data streams, with tens of thousands of devices each, to produce the corresponding output events. Currently I just fire a new Task per pattern and hook up the output to downstream handling logic (sending notifications/emails, etc.). Is there a guide to this kind of parallel processing for Trill?

I also noticed that Trill is a single-node solution, so I can partition the data per group of devices per node. I just want to know what the recommendation is.

BTW, the default Config.DataBatchSize is 80K, which is big, and it is a singleton, which means all processors (queries) share the same setting. In our case, different biz logic would need different buffer sizes, so changing it in one app instance would not be ideal, I believe?

Thanks, great work!

Yours,
Wilson

commented

For parallelization, it is generally left to the consumer. I think ideally you would partition the input source, e.g. by device id, and have separate (possibly otherwise identical) parallel Trill pipelines processing each partition on a separate thread/process. If the actual input source cannot be partitioned, you can have a dedicated Trill pipeline to perform that partitioning.
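
As a rough sketch of the "partition by device id" step on the consumer side, the snippet below routes samples to one of N buckets in plain C#. `PipelineIndexFor`, the bucket lists, and the count of 4 are all hypothetical; in a real setup each bucket would feed its own Trill pipeline rather than a list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical router: pick one of N pipelines for a device so that all samples
// from the same device always land on the same pipeline.
static int PipelineIndexFor(Guid deviceId, int count)
{
    // First four bytes of the Guid as a stable, non-negative number.
    int hash = BitConverter.ToInt32(deviceId.ToByteArray(), 0) & int.MaxValue;
    return hash % count;
}

const int pipelineCount = 4;
var queues = new List<(Guid DeviceId, double Percentage)>[pipelineCount];
for (int i = 0; i < pipelineCount; i++) queues[i] = new List<(Guid, double)>();

var laptop = Guid.NewGuid();
foreach (double pct in new[] { 50.0, 49.5, 49.0 })
    queues[PipelineIndexFor(laptop, pipelineCount)].Add((laptop, pct));

// Each queue stands in for the input of one pipeline on its own thread or process.
Console.WriteLine(queues[PipelineIndexFor(laptop, pipelineCount)].Count);   // 3
```

Routing on a stable function of the device id keeps each device's samples together and in order, which is what the per-device cycle detection relies on.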

Since each Trill pipeline will have multiple output sources/formats, you can use Multicast so that each query fork shares as much as possible with the other forks. E.g., all forks are likely able to share the ingressed input, so after ingress, you can call Multicast to send that single input stream to multiple output streams.

For the Config.DataBatchSize, it is unfortunately currently configured as a global static, so all Trill pipelines in a given process will share the same batch size. You can work around this by isolating Trill instances to different processes for different batch sizes.

@peterfreiling thank you very much for all the detailed explanation and the great working sample. It's running quite solidly so far in our development environment without partitioning.

However, based on your test case code, if I use PartitionedStreamEvent: Subject<PartitionedStreamEvent<Guid, IngestValue>>

And use punctuation:

var ingress = qc.RegisterInput(Input, DisorderPolicy.Drop(reorderLatency: 500), PartitionedFlushPolicy.FlushOnBatchBoundary /* does not work with PartitionedFlushPolicy.FlushOnLowWatermark as well */, PeriodicPunctuationPolicy.Time((ulong)TimeSpan.FromSeconds(120).Ticks));

There is no egress event triggered at all, even though I ingested over 100 values spanning more than 5 minutes.

I read through all the sample code. Only one sample uses a partitioned stream, the rules engine one, but it does not use any punctuation to flush, and it does not use PartitionedStreamEvent per se.

I also read through all the issues (including the closed ones). A few relate to partitioning, but none mentions how flushing actually works for a partitioned stream.

commented

Thanks @unruledboy, there is a bug in our partitioned pattern matching code where matches aren't detected properly. @rodrigoaatmicrosoft has a fix and will send a PR for the issue soon.

@peterfreiling that's great news that there is a fix for it. I will certainly test it out once it is available.

Also, as you mentioned, if the incoming events are an IStreamable, we can use Multicast when the ingested values can be shared between biz logics. But with an implementation like the one we have discussed and as in your sample, which uses a pub/sub-like mechanism, I believe we can't use Multicast.

commented

The sample provided is compatible with multicast, but it depends on your query logic, and how much of the query pipeline is shared between the different forks. From the IStreamable that you want to share between the multiple subqueries, you can call Multicast to get an array of IStreamables to be used in the forks. For example, if you wanted to fork immediately after ingress for two subqueries, one to detect cycles and another to track the average battery percentage, you could do something like:

            var subQueries = ingress.Multicast(2);

            var cycleQuery = subQueries[0]
                .Detect(pattern, maxDuration: 100, allowOverlappingInstances);

            var output = new List<BatteryCycleInfo>();
            qc.RegisterOutput(cycleQuery)
                .Where(e => e.IsData)
                .ForEachAsync(e => output.Add(e.Payload));

            var averageQuery = subQueries[1]
                .TumblingWindowLifetime(tumbleDuration: 5)
                .Average(batteryPercent => batteryPercent);

            var averagesOutput = new List<double>();
            qc.RegisterOutput(averageQuery)
                .Where(e => e.IsData)
                .ForEachAsync(e => averagesOutput.Add(e.Payload));

            var process = qc.Restore();

@peterfreiling aha, now I get how to do multicast with this pattern. You are right, this is a better way to deal with multiple pieces of logic sharing the same set of data.

@peterfreiling for the multicast, I ended up with this for min/max/avg values per biz logic per minute per device. The beauty of using Trill is that I can work out the aggregate values in real time, without traditional periodic scheduling logic running in the background.

var queries = ingress.Multicast(4);
............
............
var minDetection = queries[1].TumblingWindowLifetime(tumbleDuration: TimeSpan.TicksPerMinute)
	.GroupApply(
	e => e.DeviceId,
	s => s.Min(e => e.Value),
	(g, p) => new { g.Key, Value = p });

qc.RegisterOutput(minDetection)
	.Where(e => e.IsData)
	.ForEachAsync(e => onValueTrigger(DeviceId: e.Payload.Key, Value: e.Payload.Value, Type: AggregationType.Min));

var maxDetection = queries[2].TumblingWindowLifetime(tumbleDuration: TimeSpan.TicksPerMinute)
	.GroupApply(
	e => e.DeviceId,
	s => s.Max(e => e.Value),
	(g, p) => new { g.Key, Value = p });

qc.RegisterOutput(maxDetection)
	.Where(e => e.IsData)
	.ForEachAsync(e => onValueTrigger(DeviceId: e.Payload.Key, Value: e.Payload.Value, Type: AggregationType.Max));

var avgDetection = queries[3].TumblingWindowLifetime(tumbleDuration: TimeSpan.TicksPerMinute)
	.GroupApply(
	e => e.DeviceId,
	s => s.Average(e => e.Value),
	(g, p) => new { g.Key, Value = p });

qc.RegisterOutput(avgDetection)
	.Where(e => e.IsData)
	.ForEachAsync(e => onValueTrigger(DeviceId: e.Payload.Key, Value: e.Payload.Value, Type: AggregationType.Avg));