db: Metrics.WAL.BytesWritten appears bogus
jbowens opened this issue · comments
Jackson Owens commented
This metric spikes orders of magnitude beyond WAL.BytesIn, even orders of magnitude beyond bytes flushed or compacted, and beyond the node-level bytes written.
Internal slack link: https://cockroachlabs.slack.com/archives/C06TG0C6VGS/p1712855654340209?thread_ts=1712673101.370579&cid=C06TG0C6VGS
Jackson Owens commented
Taking the rate:
rate(storage_wal_bytes_written{cluster="$cluster",instance=~"$instances"}[$__rate_interval])
The raw counter values:
storage_wal_bytes_written{cluster="$cluster",instance=~"$instances"}
I don't really understand the rate graph. The counter appears to violate monotonicity, which is probably the source of the issue. I don't know Prometheus well enough to say why such a small regression in monotonicity results in such a large increase in the rate.