scylladb / scylla-monitoring

Simple monitoring of Scylla with Grafana

Home Page: https://scylladb.github.io/scylla-monitoring/

Bug: Non-Paged CQL Reads Gauge isn't working.

DanielHe4rt opened this issue

Installation details
Dashboard Name: Scylla CQL
Scylla-Monitoring Version: 4.6, 4.7 (tested)
Scylla-Version: Any

Problem

[Screenshot: Grafana Dashboard]

I was testing the Optimization tab and saw that the "Non-Paged CQL Reads" gauge isn't working. After a couple of hours of debugging, I found that the query fails because of a by ([[by]]) clause inside the statement that isn't present in any other gauge.

To make sure, I reproduced it on other versions and on Scylla Cloud, and the problem is still there. It may be affecting other versions too, so it would be really good to check.

Here's the current query:


100 * (
    (
        sum(rate(
            scylla_cql_unpaged_select_queries{
                instance=~"[[node]]",
                cluster="$cluster", 
                dc=~"$dc", 
                shard=~"[[shard]]"
            }[1m]
        )) 
        - 
        sum(rate(
            scylla_cql_unpaged_select_queries_per_ks{
                ks="system", 
                instance=~"[[node]]", 
                cluster="$cluster", 
                dc=~"$dc", 
                shard=~"[[shard]]"
            }[1m]
        )) by ([[by]])
    )
    /
    sum(rate(
        scylla_cql_reads{
            instance=~"[[node]]",
            cluster="$cluster", 
            dc=~"$dc", 
            shard=~"[[shard]]"
        }[1m]
    ))
) OR vector(0)

The new query with the fix:


100 * (
    (
        sum(rate(
            scylla_cql_unpaged_select_queries{
                instance=~"[[node]]",
                cluster="$cluster", 
                dc=~"$dc", 
                shard=~"[[shard]]"
            }[1m]
        )) 
        - 
        sum(rate(
            scylla_cql_unpaged_select_queries_per_ks{
                ks="system", 
                instance=~"[[node]]", 
                cluster="$cluster", 
                dc=~"$dc", 
                shard=~"[[shard]]"
            }[1m]
        )) # by ([[by]]) removed
    )
    /
    sum(rate(
        scylla_cql_reads{
            instance=~"[[node]]",
            cluster="$cluster", 
            dc=~"$dc", 
            shard=~"[[shard]]"
        }[1m]
    ))
) OR vector(0)
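
A rough way to see why the extra by ([[by]]) could break the subtraction: PromQL binary operators pair series with identical label sets by default. The Python sketch below is only a toy model of that matching, not Prometheus code. It assumes [[by]] expands to a label such as instance, and the sample rates, label values and helper names (sum_by, subtract) are made up for illustration.

```python
from collections import defaultdict


def sum_by(samples, by=()):
    """Toy model of PromQL sum(...) by (...): keep only the `by` labels, add values."""
    groups = defaultdict(float)
    for labels, value in samples:
        key = frozenset((k, v) for k, v in labels.items() if k in by)
        groups[key] += value
    return dict(groups)


def subtract(left, right):
    """Toy model of one-to-one matching for `left - right`:
    only series whose label sets are identical on both sides are paired."""
    return {key: left[key] - right[key] for key in left if key in right}


# Hypothetical per-second rates (assumption: two nodes, one shard each).
unpaged = [
    ({"instance": "node1", "shard": "0"}, 5.0),
    ({"instance": "node2", "shard": "0"}, 3.0),
]
unpaged_system = [
    ({"instance": "node1", "shard": "0"}, 1.0),
    ({"instance": "node2", "shard": "0"}, 1.0),
]

# Shape of the current query: the left side has no labels,
# the right side keeps `instance` (assumed value of [[by]]).
broken = subtract(sum_by(unpaged), sum_by(unpaged_system, by=("instance",)))

# Shape of the fixed query: both sides collapse to one label-less series.
fixed = subtract(sum_by(unpaged), sum_by(unpaged_system))

print(broken)  # {}
print(fixed)   # {frozenset(): 6.0}
```

In this model the current query shape pairs nothing, so the subtraction comes back empty, while the fixed shape yields a single value. If the real query behaves the same way, the trailing OR vector(0) would presumably be what ends up on the gauge.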

Hope that it helps somehow.

Hey @amnonh! I'm running the LTS 4.7 and still having this issue. Can you verify if it was merged in the other versions?

[Screenshot: non-paged-gauge]

4.7 isn't LTS (AFAIK) and the bug is fixed in 4.8 (not released yet).