influxdata / community-templates

InfluxDB Community Templates: Quickly collect & analyze time series data from a range of sources: Kubernetes, MySQL, Postgres, AWS, Nginx, Jenkins, and more.

Home Page: https://www.influxdata.com/products/influxdb-templates/gallery/


Downsampling Task that works on all data types without hardcoded filters

R-Studio opened this issue

I know the downsampling task examples, but I don't want to change the downsampling tasks and their hardcoded filters every time we add new data to our InfluxDB. More information below:

What I want
I want to downsample all of my data (with mean) from my raw bucket "telegraf" (frequency = 30s, retention 30 days) like this:

  • telegraf -> telegraf_90d (frequency = 1h, retention 90 days)
  • telegraf_90d -> telegraf_365d (frequency = 12h, retention 365 days)
    ..

What I get / Error
Unfortunately I get the following error:
could not execute task run: unsupported input type for mean aggregate: string

What is the cause?
I have some collectors, like NetApp Harvest, that unfortunately sometimes write strings or booleans to "_value", and that's why I get the error above.
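
(Side note: a one-off query like the following should list which measurements/fields actually hold strings. It is only a sketch and assumes a Flux version that ships the "types" package, which my 2.0.5 install mentioned below does not have.)

import "types"

// One-off check: list the measurement/field pairs whose "_value" is a string,
// i.e. the series that make mean() fail.
from(bucket: "telegraf")
    |> range(start: -1h)
    |> filter(fn: (r) => types.isType(v: r._value, type: "string"))
    |> group(columns: ["_measurement", "_field"])
    |> first()
    |> keep(columns: ["_measurement", "_field"])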

What I want to prevent / What is my goal

  • I want a downsampling task that works with all supported data types without hardcoding them (see the sketch after this list).
  • I don't want to have to exclude every _measurement or _field that has strings in "_value".
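
Rough sketch of what I mean (not tested, and it assumes a Flux version that ships the "types" package, which my 2.0.5 install below does not have, so it would mean upgrading): branch on the value type instead of on hardcoded measurements, average the numeric series, keep the last value per window for strings/booleans, and write both streams back:

import "types"

option task = {name: "task_telegraf_90d", every: 1h}

// numeric = float, int, or uint; everything else (string, boolean, ...) goes to the other branch
isNum = (v) =>
    types.isType(v: v, type: "float") or types.isType(v: v, type: "int") or types.isType(v: v, type: "uint")

data = from(bucket: "telegraf")
    |> range(start: -duration(v: int(v: task.every) * 2))

numeric = data
    |> filter(fn: (r) => isNum(v: r._value))
    |> aggregateWindow(every: 1h, fn: mean)

nonNumeric = data
    |> filter(fn: (r) => not isNum(v: r._value))
    |> aggregateWindow(every: 1h, fn: last)

union(tables: [numeric, nonNumeric])
    |> filter(fn: (r) => exists r._value)
    |> to(bucket: "telegraf_90d", org: "MYORG")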

Acceptable Workaround (in case we don't find any solution)
Exclude from downsampling all data that does not contain a numeric value in "_value".
As a quick fix I thought about excluding all non-numeric data with a regex, but this doesn't work and I haven't gotten any help with it (influxdata/flux#3804).
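
Rough sketch of that workaround (again assuming the "types" package instead of a regex, so a newer Flux than 2.0.5): keep only values that mean() can aggregate and skip string/boolean series entirely:

import "types"

option task = {name: "task_telegraf_90d", every: 1h}

from(bucket: "telegraf")
    |> range(start: -duration(v: int(v: task.every) * 2))
    // keep only values mean() can handle; string and boolean series are dropped
    |> filter(fn: (r) =>
        types.isType(v: r._value, type: "float") or types.isType(v: r._value, type: "int") or types.isType(v: r._value, type: "uint"))
    |> aggregateWindow(every: 1h, fn: mean)
    |> filter(fn: (r) => exists r._value)
    |> to(bucket: "telegraf_90d", org: "MYORG")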

-> But I would be very happy if we find a solution for my issue rather than just a workaround.

My downsampling task (one of them)

option task = {name: "task_telegraf_90d", every: 1h}

data = from(bucket: "telegraf")
	|> range(start: -duration(v: int(v: task.every) * 2))
	|> filter(fn: (r) =>
		(r._measurement =~ /.*/))

data
	|> aggregateWindow(fn: mean, every: 1h)
	|> filter(fn: (r) =>
		(exists r._value))
	|> to(bucket: "telegraf_90d", org: "MYORG")

Additional Information
InfluxDB: Version 2.0.5
VM: 8 vCores & 128 GB memory
I also posted the same question to the InfluxData Community.

This is extremely important for us! I would appreciate any help.

Yeah, Influx's story around rolling up data has always left a lot to be desired, IMHO.