influxdata / community-templates

InfluxDB Community Templates: Quickly collect & analyze time series data from a range of sources: Kubernetes, MySQL, Postgres, AWS, Nginx, Jenkins, and more.

Home Page: https://www.influxdata.com/products/influxdb-templates/gallery/


Downsampling Task that works on all data types without hardcoded filters

R-Studio opened this issue

I know the downsampling task examples, but I don't want to change the downsampling tasks and their hardcoded filters every time we add new data to our InfluxDB. More information below:

What I want
I want to downsample all of my data (with mean) from my raw bucket "telegraf" (frequency = 30s, retention 30 days) like this:

  • telegraf -> telegraf_90d (frequency = 1h, retention 90 days)
  • telegraf_90d -> telegraf_365d (frequency = 12h, retention 365 days)
    ..

What I get / Error
Unfortunately I get the following error:
could not execute task run: unsupported input type for mean aggregate: string

What is the cause?
I have some collectors, like NetApp Harvest, that unfortunately sometimes write strings or booleans to "_value", and that's why I get the error above.
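
(Side note: a one-off query like the following should list which measurements/fields actually hold strings. It is only a sketch and assumes a Flux version that ships the "types" package, which my 2.0.5 install mentioned below does not have.)

import "types"

// One-off check: list the measurement/field pairs whose "_value" is a string,
// i.e. the series that make mean() fail.
from(bucket: "telegraf")
    |> range(start: -1h)
    |> filter(fn: (r) => types.isType(v: r._value, type: "string"))
    |> group(columns: ["_measurement", "_field"])
    |> first()
    |> keep(columns: ["_measurement", "_field"])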

What I want to prevent / What is my goal

  • I want a downsampling task that works with all supported data types without hardcoding them (see the sketch after this list).
  • I don't want to have to exclude every _measurement or _field that has strings in "_value".
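
Rough sketch of what I mean (not tested, and it assumes a Flux version that ships the "types" package, which my 2.0.5 install below does not have, so it would mean upgrading): branch on the value type instead of on hardcoded measurements, average the numeric series, keep the last value per window for strings/booleans, and write both streams back:

import "types"

option task = {name: "task_telegraf_90d", every: 1h}

// numeric = float, int, or uint; everything else (string, boolean, ...) goes to the other branch
isNum = (v) =>
    types.isType(v: v, type: "float") or types.isType(v: v, type: "int") or types.isType(v: v, type: "uint")

data = from(bucket: "telegraf")
    |> range(start: -duration(v: int(v: task.every) * 2))

numeric = data
    |> filter(fn: (r) => isNum(v: r._value))
    |> aggregateWindow(every: 1h, fn: mean)

nonNumeric = data
    |> filter(fn: (r) => not isNum(v: r._value))
    |> aggregateWindow(every: 1h, fn: last)

union(tables: [numeric, nonNumeric])
    |> filter(fn: (r) => exists r._value)
    |> to(bucket: "telegraf_90d", org: "MYORG")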

Acceptable Workaround (in case we don't find any solution)
Exclude from downsampling all data that does not contain a numeric value in "_value".
As a quick fix I thought about excluding all non-numeric data with a regex, but this doesn't work and I haven't gotten any help with it (influxdata/flux#3804).
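
Rough sketch of that workaround (again assuming the "types" package instead of a regex, so a newer Flux than 2.0.5): keep only values that mean() can aggregate and skip string/boolean series entirely:

import "types"

option task = {name: "task_telegraf_90d", every: 1h}

from(bucket: "telegraf")
    |> range(start: -duration(v: int(v: task.every) * 2))
    // keep only values mean() can handle; string and boolean series are dropped
    |> filter(fn: (r) =>
        types.isType(v: r._value, type: "float") or types.isType(v: r._value, type: "int") or types.isType(v: r._value, type: "uint"))
    |> aggregateWindow(every: 1h, fn: mean)
    |> filter(fn: (r) => exists r._value)
    |> to(bucket: "telegraf_90d", org: "MYORG")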

-> But I would be very happy if we find a solution for my issue rather than just a workaround.

My downsampling task (one of them)

option task = {name: "task_telegraf_90d", every: 1h}

data = from(bucket: "telegraf")
	|> range(start: -duration(v: int(v: task.every) * 2))
	|> filter(fn: (r) =>
		(r._measurement =~ /.*/))

data
	|> aggregateWindow(fn: mean, every: 1h)
	|> filter(fn: (r) =>
		(exists r._value))
	|> to(bucket: "telegraf_90d", org: "MYORG")

Additional Information
InfluxDB: Version 2.0.5
VM: 8 vCores & 128 GB memory
I also posted the same question to the InfluxData Community.

This is extremely important for us! I would appreciate any help.

Yeah, Influx's story around rolling up data has always left a lot to be desired, IMHO.