knyar / nginx-lua-prometheus

Prometheus metric library for Nginx written in Lua

Repository from GitHub: https://github.com/knyar/nginx-lua-prometheus

Character Injection Causing Malformed Metrics

srguglielmo opened this issue

Hello! There seems to be a bug in nginx-lua-prometheus that allows certain characters in an HTTP request to pass unescaped to the <host>:9145/metrics endpoint. Prometheus then marks the metrics endpoint as DOWN with the error "unsupported character in float". I was able to reproduce this using Docker images. The versions are OpenResty 1.21.4.1, Prometheus 2.36.2, and nginx-lua-prometheus 0.20220527.

Given the following files in the same directory:

docker-compose.yaml:

services:
  nginx-lua-prometheus:
    build: .
    container_name: nginx-lua-prometheus
    ports:
      - "8080:8080/tcp" # Nginx's built-in stub_status (/stub_status).
      - "9145:9145/tcp" # Exposed by nginx-lua-prometheus (/metrics).
  prometheus:
    container_name: prometheus
    image: prom/prometheus:v2.36.2
    ports:
      - "9090:9090/tcp"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml

Dockerfile:

FROM openresty/openresty:1.21.4.1-alpine-fat
RUN /usr/local/openresty/bin/opm get knyar/nginx-lua-prometheus=0.20220527
COPY nginx-lua-prometheus.conf /etc/nginx/conf.d/

nginx-lua-prometheus.conf:

lua_shared_dict prometheus_metrics 10M;
lua_package_path "/usr/local/openresty/site/lualib/?.lua;;";
init_worker_by_lua_block {
  prometheus = require("prometheus").init("prometheus_metrics")
  metric_active = prometheus:gauge("nginx_active", "Nginx Instance Active", {})
  metric_requests = prometheus:counter("nginx_http_requests_total", "Number of HTTP requests", {"host", "status", "path"})
  metric_bytes = prometheus:counter("nginx_http_bytes_total", "Number of bytes in HTTP request and response", {"host"})
  metric_latency = prometheus:histogram("nginx_http_request_duration_seconds", "HTTP request latency", {"host", "path"})
  metric_connections = prometheus:gauge("nginx_http_connections", "Number of HTTP connections", {"state"})
}
log_by_lua_block {
  metric_requests:inc(1, {ngx.var.server_name, ngx.var.status, ngx.var.uri})
  metric_latency:observe(tonumber(ngx.var.request_time), {ngx.var.server_name, ngx.var.uri})
}

server {
  listen 9145;
  allow all;
  location /metrics {
    content_by_lua_block {
      metric_connections:set(ngx.var.connections_active, {"active"})
      metric_connections:set(ngx.var.connections_reading, {"reading"})
      metric_connections:set(ngx.var.connections_waiting, {"waiting"})
      metric_connections:set(ngx.var.connections_writing, {"writing"})
      prometheus:collect()
    }
  }
}

server {
    listen 8080;
    location = /stub_status {
        stub_status;
    }
}

prometheus.yml:

scrape_configs:
  - job_name: nginx-lua-prometheus-scraper
    dns_sd_configs:
      - names:
          - nginx-lua-prometheus
        type: A
        port: 9145

To reproduce the bug:

  1. Execute docker compose up.
  2. In a browser on the same machine, browse to http://localhost:9090/targets
  3. Wait/Refresh the page until Prometheus has scraped the nginx-lua-prometheus-scraper endpoint and marks it as UP without error.
  4. Send a malformed HTTP request to OpenResty on port 8080: curl -v 'http://localhost:8080/test.txt%0d%0aSet-Cookie:CRLFInjection=Test%0d%0aLocation:%20example.com%0d%0aX-XSS-Protection:0'. Note that you will receive a 404 Not Found, since the file "test.txt" does not exist.

The container logs will show:

nginx-lua-prometheus  | 2022/07/07 23:44:58 [error] 7#7: *202 open() "/usr/local/openresty/nginx/html/test.txt
nginx-lua-prometheus  | Set-Cookie:CRLFInjection=Test
nginx-lua-prometheus  | Location: example.com
nginx-lua-prometheus  | X-XSS-Protection:0" failed (2: No such file or directory), client: 172.18.0.1, server: , request: "GET /test.txt%0d%0aSet-Cookie:CRLFInjection=Test%0d%0aLocation:%20example.com%0d%0aX-XSS-Protection:0 HTTP/1.1", host: "localhost:8080"
nginx-lua-prometheus  | 172.18.0.1 - - [07/Jul/2022:23:44:58 +0000] "GET /test.txt%0d%0aSet-Cookie:CRLFInjection=Test%0d%0aLocation:%20example.com%0d%0aX-XSS-Protection:0 HTTP/1.1" 404 159 "-" "curl/7.84.0"

  5. Browse to the Prometheus scrape targets (http://localhost:9090/targets) and notice that the endpoint is now DOWN with the error "unsupported character in float".

The nginx-lua-prometheus endpoint directly at http://localhost:9145/metrics shows:

# HELP nginx_http_connections Number of HTTP connections
# TYPE nginx_http_connections gauge
nginx_http_connections{state="active"} 2
nginx_http_connections{state="reading"} 0
nginx_http_connections{state="waiting"} 1
nginx_http_connections{state="writing"} 1
# HELP nginx_http_request_duration_seconds HTTP request latency
# TYPE nginx_http_request_duration_seconds histogram
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="0.005"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="0.01"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="0.02"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="0.03"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="0.05"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="0.075"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="0.1"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="0.2"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="0.3"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="0.4"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="0.5"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="0.75"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="1"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="1.5"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="2"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="3"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="4"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="5"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="10"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/favicon.ico",le="+Inf"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="0.005"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="0.01"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="0.02"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="0.03"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="0.05"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="0.075"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="0.1"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="0.2"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="0.3"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="0.4"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="0.5"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="0.75"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="1"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="1.5"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="2"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="3"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="4"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="5"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="10"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/index.html",le="+Inf"} 1
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="0.005"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="0.01"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="0.02"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="0.03"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="0.05"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="0.075"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="0.1"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="0.2"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="0.3"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="0.4"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="0.5"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="0.75"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="1"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="1.5"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="2"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="3"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="4"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="5"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="10"} 6
nginx_http_request_duration_seconds_bucket{host="",path="/metrics",le="+Inf"} 6
X-XSS-Protection:0",le="0.005"} 1
X-XSS-Protection:0",le="0.01"} 1
X-XSS-Protection:0",le="0.02"} 1
X-XSS-Protection:0",le="0.03"} 1
X-XSS-Protection:0",le="0.05"} 1
X-XSS-Protection:0",le="0.075"} 1
X-XSS-Protection:0",le="0.1"} 1
X-XSS-Protection:0",le="0.2"} 1
X-XSS-Protection:0",le="0.3"} 1
X-XSS-Protection:0",le="0.4"} 1
X-XSS-Protection:0",le="0.5"} 1
X-XSS-Protection:0",le="0.75"} 1
X-XSS-Protection:0",le="1"} 1
X-XSS-Protection:0",le="1.5"} 1
X-XSS-Protection:0",le="2"} 1
X-XSS-Protection:0",le="3"} 1
X-XSS-Protection:0",le="4"} 1
X-XSS-Protection:0",le="5"} 1
X-XSS-Protection:0",le="10"} 1
X-XSS-Protection:0",le="+Inf"} 1
nginx_http_request_duration_seconds_count{host="",path="/favicon.ico"} 1
nginx_http_request_duration_seconds_count{host="",path="/index.html"} 1
nginx_http_request_duration_seconds_count{host="",path="/metrics"} 6
nginx_http_request_duration_seconds_count{host="",path="/test.txt
Set-Cookie:CRLFInjection=Test
Location: example.com
X-XSS-Protection:0"} 1
nginx_http_request_duration_seconds_sum{host="",path="/favicon.ico"} 0
nginx_http_request_duration_seconds_sum{host="",path="/index.html"} 0
nginx_http_request_duration_seconds_sum{host="",path="/metrics"} 0
nginx_http_request_duration_seconds_sum{host="",path="/test.txt
Set-Cookie:CRLFInjection=Test
Location: example.com
X-XSS-Protection:0"} 0
# HELP nginx_http_requests_total Number of HTTP requests
# TYPE nginx_http_requests_total counter
nginx_http_requests_total{host="",status="200",path="/index.html"} 1
nginx_http_requests_total{host="",status="200",path="/metrics"} 6
nginx_http_requests_total{host="",status="404",path="/favicon.ico"} 1
nginx_http_requests_total{host="",status="404",path="/test.txt
Set-Cookie:CRLFInjection=Test
Location: example.com
X-XSS-Protection:0"} 1
# HELP nginx_metric_errors_total Number of nginx-lua-prometheus errors
# TYPE nginx_metric_errors_total counter
nginx_metric_errors_total 0
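
To make the failure mode concrete: the text exposition format is line-oriented, so a raw line break inside a label value splits one sample into several lines that no longer parse. The sketch below is only a rough illustration of that idea, not how Prometheus actually parses a scrape. The function name `find_malformed_lines` is hypothetical, and the check is deliberately simplified (it ignores label syntax inside the braces, timestamps, and exemplars).

```lua
-- Values that are valid in the exposition format but may not be
-- accepted by tonumber().
local special_values = { ["+Inf"] = true, ["-Inf"] = true, ["NaN"] = true }

-- Hypothetical helper: returns every non-empty line that is neither a
-- comment nor something that looks like "name{labels} value" / "name value".
local function find_malformed_lines(text)
  local bad = {}
  for line in text:gmatch("[^\n]+") do
    local is_comment = line:sub(1, 1) == "#"
    -- Capture the sample value, with or without a label set.
    local value = line:match("^[%a_:][%w_:]*{.*}%s+(%S+)$")
      or line:match("^[%a_:][%w_:]*%s+(%S+)$")
    local is_sample = value ~= nil
      and (tonumber(value) ~= nil or special_values[value] == true)
    if not (is_comment or is_sample) then
      bad[#bad + 1] = line
    end
  end
  return bad
end

-- A shortened version of the broken counter from the output above.
local dump = table.concat({
  '# TYPE nginx_http_requests_total counter',
  'nginx_http_requests_total{host="",status="404",path="/test.txt',
  'Set-Cookie:CRLFInjection=Test',
  'Location: example.com',
  'X-XSS-Protection:0"} 1',
  'nginx_metric_errors_total 0',
}, "\n")

for _, line in ipairs(find_malformed_lines(dump)) do
  print("malformed: " .. line)
end
```

In this simplified check, all four pieces of the split sample are flagged, while the comment line and the unaffected `nginx_metric_errors_total` sample pass.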

I'd appreciate any assistance or guidance. Thank you!

Thanks a lot for the detailed report!

We currently escape double quotes and backslashes in label values, but looking at the OpenMetrics spec I think we also need to escape line breaks:

escaped-char = normal-char
escaped-char =/ BS ("n" / DQUOTE / BS)
escaped-char =/ BS normal-char

; Any unicode character, except newline, double quote, and backslash
normal-char = %x00-09 / %x0B-21 / %x23-5B / %x5D-D7FF / %xE000-10FFFF
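
As a minimal sketch of what escaping according to this grammar could look like in Lua: the function name `escape_label_value` is hypothetical and this is not the library's actual code, only an illustration of the three escapes the grammar names (backslash, double quote, newline). Note that a carriage return (0x0D) falls inside the `%x0B-21` range, so by this grammar it counts as a normal character and is left untouched here.

```lua
-- Hypothetical helper, not taken from nginx-lua-prometheus.
-- Escapes a label value so that it stays on a single line of output.
local function escape_label_value(value)
  -- tostring() so that numbers or nil do not raise an error in gsub.
  local s = tostring(value)
  -- Backslashes go first; otherwise the backslashes added by the
  -- next two substitutions would be escaped a second time.
  s = s:gsub("\\", "\\\\")
  s = s:gsub('"', '\\"')
  s = s:gsub("\n", "\\n")
  return s
end

-- Usage with a path similar to the one in the report.
local path = "/test.txt\nSet-Cookie:CRLFInjection=Test"
print('path="' .. escape_label_value(path) .. '"')
```

With the newline replaced by the two characters `\n`, the whole label set stays on one line, which is what a line-oriented parser needs.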

Is this something you would be interested in preparing a PR for?

As a side note, using ngx.var.uri of a publicly available site as a metric label directly is not a great idea, since you basically allow anyone on the internet to increase your metric cardinality. Please see the following discussions for more detail: