microsoft / Kusto-Query-Language

Kusto Query Language is a simple and productive language for querying Big Data.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Allow make-series to fill empty values with the previous known value

fgheysels opened this issue · comments

In KQL, there are multiple aggregation methods available like series_fill_forward, series_fill_backward, etc... These functions work on a series, so in most of the cases, you need to define the series first via make-series.
make-series allows you to fill in the gaps via the default argument. As far as I know, the value that is passed as 'default' must be a const.

It would be nice if we could combine the 'interpolation' with the make-series function, so that it is allowed to do something like this:

make-series myseries = max(value) default=fill_previous on timestamp .... 
commented

Thanks for the suggestion - it was added as a feature request to our backlog. It is marked as low priority because it is an ease of use request (it can be achieved by using a make-series followed by a series functions)

Thanks.

Although, it is not very clear to me how I can easily retrieve a series of metrics, for which I sample down the values that are returned, to give me for instance the last known value per 15 minute interval, and if I don't have any sample for a specific 15-minute interval, the value of the previous interval can be used.

I came up with something like this:

Table
| where timestamp  > ago(5d) 
| where tag == "sometag" 
| summarize arg_max(timestamp, value_decimal) by bin(timestamp, 15m) 
| make-series sometag_series = max(value_decimal) default=long(null) on timestamp1 step 15m 
| project spec = series_fill_forward(sometag_series), timestamp1 
| render linechart with ( ycolumns=spec,  xcolumn=timestamp1) 

As you can see, I first use the arg_max function to get the LKV per interval of 15 minutes, but afterwards, in the make-series statement, I need to provide another aggregation method. I think I can use min, max, or even any here since I should normally have only one value per interval, but it doesn't make the statement very readable imho. (Unless there are other ways to do this ? )

commented

I think the summarize is redundant: make-series will take the maximal value in each 15m bin (step) with default null, the consequent series_fill_forward will fill the empty bins with the last known value. I think the following query (similar to your query, omitting the arg_max summarize) is quite readable:

Table
| where timestamp  > ago(5d) 
| where tag == "sometag" 
| make-series sometag_series = max(value) default=double(null) on timestamp step 15m 
| project spec = series_fill_forward(sometag_series), timestamp 
| render linechart with ( ycolumns=spec,  xcolumn=timestamp)

I don't think it is similar, as I use arg_max on timestamp in the summarize statement to get the last known value per interval.
If you omit this, you get -afaik- the maximum value per interval, but the maximum value is not necesserily the last known value.

commented

Right - I missed that you want last in each bin and not max. Then the ask is for make-series to support arg_max() aggregation? (currently arg_max is not supported in make-series) so you could do something like the following:

Table
| where timestamp  > ago(5d) 
| where tag == "sometag" 
| make-series (ts,sometag_series) = arg_max(timestamp, value) default=double(null) on timestamp step 15m 
| project spec = series_fill_forward(sometag_series), timestamp 
| render linechart with ( ycolumns=spec,  xcolumn=timestamp)

Ideally, this should be possible:

make-series (ts, sometag_value) = arg_max(timestamp, value) default = fill_previous on .... 

arg_max in make-series and fill_previous functionality for the default keyword.

commented

Ack. Both have been added as feature requests to our backlog - they are both ease of use features (tagged with low priority). Thanks for the feedback

I'm closing this as it is not an open issue specifically for this library.

Why not ? This repository is called 'Kusto-Query-Language'.
this is a feature request forthe Kusto query language.

This github repository is for a parser. The change you've requested requires change of the Kusto service - therefore it's not the right place to track it. Please use https://aka.ms/adx.uservoice to suggest and upvote the idea.