backfiller

A CLI tool, written as a practice project, that retroactively evaluates Prometheus recording rules.

Aims to fix issue prometheus/prometheus#11.

Build

GO111MODULE=on go build -o backfiller main.go

How to use

➜  backfiller -h
usage: backfiller [<flags>] <rule-file> [<db path>] [<dest path>]

Tooling for backfilling Prometheus Recording Rules.

Flags:
  -h, --help                  Show context-sensitive help (also try --help-long and --help-man).
      --version               Show application version.
      --max-samples=50000000  Maximum number of samples a single query can load into memory. Note that queries will fail if they try to load more
                              samples than this into memory, so this also limits the number of samples a query can return.
      --timeout=2m            Maximum time a query may take before being aborted.
      --start=START           Start time (RFC3339 or Unix timestamp).
      --end=END               End time (RFC3339 or Unix timestamp).
      --eval-interval=30s     How frequently to evaluate the recording rules.
      --max-samples-in-mem=10000  
                              maximum number of samples to process in a cycle.
      --query-log-file=""     File to which PromQL queries are logged.
      --log.level=info        Only log messages with the given severity or above. One of: [debug, info, warn, error]
      --log.format=logfmt     Output format of log messages. One of: [logfmt, json]

Args:
  <rule-file>    The rule file for backfilling.
  [<db path>]    tsdb path (default is data/)
  [<dest path>]  path to generate new block (default is data/)

Tutorial

Start Prometheus locally. Pass the --storage.tsdb.allow-overlapping-blocks flag so that Prometheus accepts overlapping blocks when it reloads the TSDB.

./prometheus --storage.tsdb.allow-overlapping-blocks

Use tsdbcli to dump the series stored in the TSDB directory data/. At this point, no metric name contains test.

./tsdbcli dump data | grep test

Run the backfill with the rule file example.yaml. The tool first validates the rule file and then loads its recording rules.

The first data argument is the TSDB directory to query for past data; the second is the directory in which the new block is written.

./backfiller example.yaml data data
level=info msg="replaying WAL, this may take awhile"
level=info msg="WAL segment loaded" segment=0 maxSegment=1
level=info msg="WAL segment loaded" segment=1 maxSegment=1
level=info msg="write block" mint=1576563064320 maxt=1576563859000 ulid=01DW98EQVKD55FCJ0QJV2FTT0P duration=770.543238ms
blockId=data/01DW98EQVKD55FCJ0QJV2FTT0P

Check the metrics in the tsdb dir again.

./tsdbcli dump data | grep test | head
{__name__="test",instance="localhost:9090",job="prometheus",key="value"} 2 1576563069000
{__name__="test",instance="localhost:9090",job="prometheus",key="value"} 2 1576563074000
{__name__="test",instance="localhost:9090",job="prometheus",key="value"} 2 1576563079000
{__name__="test",instance="localhost:9090",job="prometheus",key="value"} 2 1576563084000
{__name__="test",instance="localhost:9090",job="prometheus",key="value"} 2 1576563089000
{__name__="test",instance="localhost:9090",job="prometheus",key="value"} 2 1576563094000
{__name__="test",instance="localhost:9090",job="prometheus",key="value"} 2 1576563099000
{__name__="test",instance="localhost:9090",job="prometheus",key="value"} 2 1576563104000
{__name__="test",instance="localhost:9090",job="prometheus",key="value"} 2 1576563109000
{__name__="test",instance="localhost:9090",job="prometheus",key="value"} 2 1576563114000

Prometheus merges the new block automatically during its next compaction, after which the generated data becomes visible. To see it sooner, simply restart Prometheus.

Then we can check the results on the Web UI.
