grafana / grafana

The open and composable observability and data visualization platform. Visualize metrics, logs, and traces from multiple sources like Prometheus, Loki, Elasticsearch, InfluxDB, Postgres and many more.

Home Page: https://grafana.com


Prometheus: label-names with dots rejected in Grafana v8.3.0

bad3bs opened this issue · comments

Grafana version: 8.3.0
Prometheus data source pointing to a VictoriaMetrics database (MetricsQL supports dots in label names).

Requesting a metric that has a label with a dot in its name causes an error:
...Metric: unmarshalerDecoder: "label.name" is not a valid label name, error found...

In 8.2.5 everything works fine.

Thanks for creating this issue! We think it's missing some basic information.

Follow the issue template and add additional information that will help us replicate the problem.
For data visualization issues:

  • Query results from the inspect drawer (data tab & query inspector)
  • Panel settings can be extracted in the panel inspect drawer JSON tab

For dashboard related issues:

  • Dashboard JSON can be found in the dashboard settings JSON model view

For authentication, provisioning and alerting issues, Grafana server logs are useful.

Happy graphing!

8.2.5, everything works:

[image: 8 2 5]

8.3.0, same source, same query:

[image: 8 3 0]

query:

{
  "request": {
    "url": "api/ds/query",
    "method": "POST",
    "data": {
      "queries": [
        {
          "refId": "A",
          "exemplar": false,
          "expr": "metric",
          "key": "Q-aa120463-53c9-40de-9e46-b700e4b441bd-0",
          "datasource": {
            "uid": "YaiZB7knz",
            "type": "prometheus"
          },
          "queryType": "timeSeriesQuery",
          "requestId": "Q-aa120463-53c9-40de-9e46-b700e4b441bd-0A",
          "utcOffsetSec": 28800,
          "legendFormat": "",
          "datasourceId": 6,
          "intervalMs": 15000,
          "maxDataPoints": 1872
        }
      ],
      "range": {
        "from": "2021-12-02T23:04:23.613Z",
        "to": "2021-12-02T23:34:23.613Z",
        "raw": {
          "from": "now-30m",
          "to": "now"
        }
      },
      "from": "1638486263613",
      "to": "1638488063613"
    },
    "hideFromInspector": false
  },
  "response": {
    "results": {
      "A": {
        "error": "unmarshalerDecoder: model.Matrix: model.SampleStream.Values: []model.SamplePair: Metric: unmarshalerDecoder: \"label.dot\" is not a valid label name, error found in #10 byte of ...|label.dot\":\"dev\"},\"v|..., bigger context ...|[{\"metric\":{\"__name__\":\"metric\",\"label.dot\":\"dev\"},\"values\":[[1638487905,\"1234\"],[1638487920,|..., error found in #10 byte of ...|1234\"]]}]}|..., bigger context ...|1234\"],[1638488040,\"1234\"],[1638488055,\"1234\"]]}]}|...",
        "refId": "A"
      }
    }
  }
}

We have stumbled upon the same issue in our preproduction environment.
Our stack (based on Warp 10 and translation proxies) also allows dots in label and metric names.

Same problem in 8.3.1

Same problem 8.3.2

Same in 8.3.3


This might be related to the Prometheus backend migration work (could help whoever investigates).

I have the same problem
Grafana version 8.3.3
Victoria Metrics version 1.71.0

t=2021-12-23T02:23:23+0000 lvl=eror msg="Exemplar query failed" logger=tsdb.prometheus query="process_files_open_files{application="demo"}" err="[]v1.ExemplarQueryResult: decode slice: expect [ or n, but found , error found in #0 byte of ...||..., bigger context ...||..."

Also seeing this when connecting to New Relic as a Prometheus Datasource

"unmarshalerDecoder: model.Matrix: model.SampleStream.Metric: unmarshalerDecoder: "entity.guid" is not a valid label name, error found in #10 byte of ...|tity.guid":"NDU5NDYz|..., bigger context 

I've been looking for the bug using git bisect. Grafana worked fine with v8.2.5 and many users have seen the bug in v8.3.0 (see commits between tags), so the regression has to be in one of the commits between those two tags. That allows the following analysis.

The scenario was run on a fresh Debian 10 install with Grafana built from source following the developer guide (see the long version below).
On this brand-new Grafana instance, I recreated a Prometheus data source (the same one on which I first saw the bug), imported one dashboard, and then iterated: run a git bisect command, rebuild with yarn install --immutable && yarn start and make run, and refresh the web browser.

According to git bisect on the test scenario described above, the bug was introduced in commit 7140867.

The scenario might not be perfect, so the outcome could be wrong.
I would be happy to give it another try if it turns out that I missed a critical step. Feel free to say so.

Here is the output of git bisect log at the end:

# bad: [914fcedb72c5e1bd6752f8f311b14df9cc7f7281] "Release: Updated versions in package to 8.3.0" (#42523)
# good: [b57a137acd75de86455dbbec89c00ae86a8bde08] "Release: Updated versions in package to 8.2.5" (#41859)
git bisect start 'v8.3.0' 'v8.2.5'
# good: [fc632276f1b04c20fe2c914067a64a0b8d1598b5] Update release notes and what’s new links (#39277)
git bisect good fc632276f1b04c20fe2c914067a64a0b8d1598b5
# bad: [49dee63453cf4e51da0a86ddbee89b831a634481] added ownership of plugins management code to the plugins platform frontend squad. (#40939)
git bisect bad 49dee63453cf4e51da0a86ddbee89b831a634481
# good: [ff9ad7ad20c060a97a2ecd30ae0e90e82d83af68] Schema: use the generated graph.gen.ts (#40090)
git bisect good ff9ad7ad20c060a97a2ecd30ae0e90e82d83af68
# bad: [a531c6e26f5beeb9dbaf6db28ef83635cc0a12d7] frontend logging fixes (#39946)
git bisect bad a531c6e26f5beeb9dbaf6db28ef83635cc0a12d7
# good: [30c1e7fa5cbda47f82ea01f2de91d34061a6da0f] A11y: Fix fastpass issues for /explore with gdev-testdata (#40309)
git bisect good 30c1e7fa5cbda47f82ea01f2de91d34061a6da0f
# good: [48eacd1ea6478ac35cd485cfe3e7ebd838a6c00b] Remove unused httpMethod (#40471)
git bisect good 48eacd1ea6478ac35cd485cfe3e7ebd838a6c00b
# bad: [73ac9c27179f5256cfc3184e3460eda5b86657a8] Update dependency @types/semver to v7 (#40515)
git bisect bad 73ac9c27179f5256cfc3184e3460eda5b86657a8
# good: [060a16041d6408c2bcfc6b50b67f0f9e815eb4e0] Fix typo in whats-new-in-v8-0.md (#38661)
git bisect good 060a16041d6408c2bcfc6b50b67f0f9e815eb4e0
# good: [8e070d6858f0d3fae72d5e26f57c23128bced4b9] influxdb: config page: typescript-fix for strict-mode (#40465)
git bisect good 8e070d6858f0d3fae72d5e26f57c23128bced4b9
# good: [58fdb717bad853a4e2ab87fbd383d1094c8ce506] Update dependency @types/react to v17 (#40440)
git bisect good 58fdb717bad853a4e2ab87fbd383d1094c8ce506
# bad: [6dc21d5899858f5dac41c4f4640d76a37d4c402d] Refactor: Decouple Label Browser from LocalStorage (#40449)
git bisect bad 6dc21d5899858f5dac41c4f4640d76a37d4c402d
# bad: [71408678685dac406fc54423f392f5e3940e93c8] Prometheus: Run dashboard queries trough backend (#40333)
git bisect bad 71408678685dac406fc54423f392f5e3940e93c8
# first bad commit: [71408678685dac406fc54423f392f5e3940e93c8] Prometheus: Run dashboard queries trough backend (#40333)
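
For readers unfamiliar with the technique, the search that git bisect performs can be sketched as a binary search over the commits between a known-good and a known-bad revision. This is only an illustration: the commit names and the `is_bad` callback are hypothetical, the history is assumed to be linear (real Git history is a graph), and the regression is assumed to stay present in every commit after the one that introduced it.

```python
from typing import Callable, Sequence, Tuple


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return the first bad commit and the number of commits that were tested.

    `commits` is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad (like v8.2.5 and v8.3.0 in the log above).
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        steps += 1
        if is_bad(commits[mid]):   # corresponds to "git bisect bad"
            bad = mid
        else:                      # corresponds to "git bisect good"
            good = mid
    return commits[bad], steps


# Hypothetical linear history of 1000 commits where the bug appears at c637.
history = [f"c{i}" for i in range(1000)]
culprit, tested = first_bad(history, lambda c: int(c[1:]) >= 637)
```

Each test halves the remaining range, which is why a handful of rebuild-and-refresh rounds is enough to isolate one commit even across a large release.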

Long version on Debian 10 (in case someone tries to reproduce):

  • Preparatory steps
apt install git build-essential gcc g++ make gnupg
rm -rf /usr/local/go &&  tar -C /usr/local -xzf go1.17.6.linux-amd64.tar.gz
echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.bashrc

curl -fsSL https://deb.nodesource.com/setup_16.x | bash -
curl -sL https://dl.yarnpkg.com/debian/pubkey.gpg | gpg --dearmor | tee /usr/share/keyrings/yarnkey.gpg >/dev/null
echo "deb [signed-by=/usr/share/keyrings/yarnkey.gpg] https://dl.yarnpkg.com/debian stable main" | tee /etc/apt/sources.list.d/yarn.list

apt-get update && apt-get install yarn nodejs

echo 'export GOPATH=$HOME/go/' >> ~/.bashrc
echo 'export NODE_OPTIONS=--max_old_space_size=8000' >> ~/.bashrc
source ~/.bashrc
  • Get the sources
git clone https://github.com/grafana/grafana.git
cd grafana
  • Frontend - in shell 1
yarn install --immutable && yarn start

  • Backend - in shell 2
make run

  • Browser
navigate and create datasource
reproduce the bug
profit
  • Bisect - in shell 3
git bisect start v8.3.0 v8.2.5

hi all, thanks for the info provided. to explain what exactly is happening, and why:

  • in Prometheus, label names cannot contain . ( https://prometheus.io/docs/concepts/data_model/ : "Label names may contain ASCII letters, numbers, as well as underscores. ")
  • this is enforced by the prometheus go client library ( https://github.com/prometheus/client_golang ) that we use, which rejects data with label-names with dots in them
  • it seems other, prometheus-compatible databases allow dots in label names
  • this is why it does not work in grafana 8.3.0+
  • the reason this worked before grafana 8.3.0 is that before that version, prometheus-data was processed differently, and did not involve the prometheus go client library (this applies to dashboard&explore queries. alert-queries were like this even before 8.3.0)

this issue will need more investigation, to decide how to best handle it. just wanted to give you info about what is happening. thanks.
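
To make the explanation above concrete, here is a minimal sketch of the difference between a decoder that enforces the label-name rule and one that does not. It is not the code of the Prometheus Go client or of Grafana: the function names are hypothetical, and the regex is the one the Prometheus data model documentation gave for label names at the time (`[a-zA-Z_][a-zA-Z0-9_]*`). The sample payload is taken from the error message earlier in this thread.

```python
import json
import re

# Label-name rule as documented in the Prometheus data model at the time.
LABEL_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_valid_label_name(name: str) -> bool:
    return LABEL_NAME_RE.fullmatch(name) is not None


def decode_series_strict(raw: str) -> dict:
    """Decode one series and reject it on the first invalid label name
    (roughly the behaviour seen in 8.3.0+)."""
    series = json.loads(raw)
    for name in series["metric"]:
        if not is_valid_label_name(name):
            raise ValueError(f'"{name}" is not a valid label name')
    return series


def decode_series_lenient(raw: str) -> dict:
    """Decode one series without validating label names
    (roughly the behaviour before 8.3.0)."""
    return json.loads(raw)


SAMPLE = '{"metric":{"__name__":"metric","label.dot":"dev"},"values":[[1638487905,"1234"]]}'
```

With this sketch, `decode_series_lenient(SAMPLE)` returns the series unchanged, while `decode_series_strict(SAMPLE)` raises a `ValueError` naming `label.dot`.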

Sorry for causing a regression here. Sadly, I don't think there is a quick solution, since we are using the official Prometheus Go client, which correctly rejects label names that the specification does not allow. If you want the Prometheus label naming rules changed, that needs to be requested from the Prometheus community: https://github.com/prometheus/client_golang , https://prometheus.io/community/

OK, maybe another solution: add a separate data source for VictoriaMetrics with all the functionality of MetricsQL? https://github.com/VictoriaMetrics/metrics

I am going to change this to a feature request. I understand it's not a great situation, but it is VictoriaMetrics (and others) that break the Prometheus standard, so we would not consider this something that has to be officially supported. That it worked before was more of an undefined behavior.

Also, there isn't an easy fix, as it does not seem that we can configure the Prometheus client to allow dots in label names. Possible solutions:

  • Use a custom client that supports Prometheus and also its flavors. At some point we may need a switch in the Prometheus data source configuration saying what is actually on the backend, because even though these backends try to be Prometheus compatible, there are obviously subtle differences between them.
  • Have a totally separate data source, either for a specific backend like VictoriaMetrics or a somewhat generic one for more flavors. This seems like a very similar amount of work to the first solution, as we would still need a custom client and possibly also support for multiple flavors in one data source anyway.

Same Problem: dotted label names (like issue #42615) not accepted
VictoriaMetrics v1.82.1 (latest)
Grafana 9.2.0 (latest)

unmarshalerDecoder: model.Matrix: model.SampleStream.Metric: unmarshalerDecoder: "com.capgemini.productionline.Description" is not a valid label name, error found in #10 byte of ...|scription":"PL Base |..., bigger context

The Dashboard was imported from (see below) which runs fine there
VictoriaMetrics v1.6.0
Grafana 8.0.3

Solved: Just figured out:

  • VictoriaMetrics v1.82.1 (latest)
  • Grafana 9.2.0 (latest)

There is a command-line flag for VictoriaMetrics that solved my problem:

(excerpt from docker-compose file)
command:
  - "--usePromCompatibleNaming"

This flag replaces characters that Prometheus does not support, such as dots, with underscores in label and metric names.

But: you have to drop all old VictoriaMetrics data and create a new VictoriaMetrics "database", because the renaming only happens during data ingestion.

Original flag description (see end of page: https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html):
"Whether to replace characters unsupported by Prometheus with underscores in the ingested metric names and label names. For example, foo.bar{a.b='c'} is transformed into foo_bar{a_b='c'} during data ingestion if this flag is set. See https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels"
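
A rough sketch of the renaming that the flag description above talks about, for illustration only. This is not VictoriaMetrics' implementation: the function names are hypothetical, and the allowed character sets are assumed from the Prometheus data model documentation (letters, digits and underscores for label names, plus colons for metric names). A leading digit is not handled here.

```python
import re

# Characters outside the sets allowed by the Prometheus data model (assumed).
_LABEL_BAD = re.compile(r"[^a-zA-Z0-9_]")
_METRIC_BAD = re.compile(r"[^a-zA-Z0-9_:]")


def sanitize_label_name(name: str) -> str:
    """Replace every unsupported character in a label name with '_'."""
    return _LABEL_BAD.sub("_", name)


def sanitize_metric_name(name: str) -> str:
    """Replace every unsupported character in a metric name with '_'."""
    return _METRIC_BAD.sub("_", name)


def sanitize_series(metric_name: str, labels: dict) -> tuple:
    """Apply both renamings to one series, as would happen at ingestion."""
    clean_labels = {sanitize_label_name(k): v for k, v in labels.items()}
    return sanitize_metric_name(metric_name), clean_labels


# The example from the flag description: foo.bar{a.b='c'}
renamed = sanitize_series("foo.bar", {"a.b": "c"})
```

Note that a renaming like this is lossy: in this sketch `a.b` and `a_b` both end up as `a_b`, so dashboards and queries written against the dotted names have to be updated to the underscored ones.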

Have a totally separate data source, either for specific backend like victoria metrics or somewhat generic for more flavors

The separate data source for VictoriaMetrics is in progress and was submitted for initial review to https://grafana.com.

Closing: this feature request is only applicable to VictoriaMetrics, which now has its own data source plugin.