neondatabase / autoscaling

Postgres vertical autoscaling in k8s

scheduler plugin's failed scheduling metrics should include preferred AZ of the pod, if available

sharnoff opened this issue

Problem description / Motivation

From the current metrics related to scheduling failures (PostFilter and Unreserve method calls), it's hard to tell the scope of any impact. Being able to see the affected AZs would significantly help with after-the-fact analysis and in-the-moment triage.

Feature idea(s) / DoD

Relevant scheduler metrics have a label for the availability zone. We may also want to consider node and node group labels, and to survey the other metrics we expose to see whether they would benefit from these labels as well.
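As a rough illustration of the idea, here is a minimal, standard-library-only Go sketch of a failure counter keyed by the scheduling method and the pod's preferred AZ. Everything in it is hypothetical and not taken from the plugin: the `FailureCounter` type, the `azLabel` helper, the `"unknown"` fallback value, and the zone names are assumptions for illustration. In the real plugin this would presumably be an extra label on the existing metrics rather than a hand-rolled map, and how the preferred AZ is read from the pod is left out entirely.

```go
package main

import (
	"fmt"
	"sync"
)

// unknownAZ is a hypothetical label value for pods with no preferred AZ.
const unknownAZ = "unknown"

// failureKey is the label set attached to each failure count.
type failureKey struct {
	Method string // e.g. "PostFilter" or "Unreserve"
	AZ     string // preferred availability zone, or unknownAZ
}

// FailureCounter is a stand-in for a labeled counter metric.
// It is an illustrative type, not part of the scheduler plugin.
type FailureCounter struct {
	mu     sync.Mutex
	counts map[failureKey]int
}

// NewFailureCounter returns an empty counter.
func NewFailureCounter() *FailureCounter {
	return &FailureCounter{counts: make(map[failureKey]int)}
}

// azLabel maps an optional preferred AZ to a label value, so that pods
// without a preference are still counted under a stable label.
func azLabel(preferredAZ string) string {
	if preferredAZ == "" {
		return unknownAZ
	}
	return preferredAZ
}

// Inc records one scheduling failure for the given method and preferred AZ.
// Pass an empty preferredAZ when the pod has none.
func (c *FailureCounter) Inc(method, preferredAZ string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[failureKey{Method: method, AZ: azLabel(preferredAZ)}]++
}

// Get returns the number of failures recorded for a method and AZ label value.
func (c *FailureCounter) Get(method, az string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[failureKey{Method: method, AZ: az}]
}

func main() {
	c := NewFailureCounter()
	c.Inc("PostFilter", "zone-a") // zone names here are made up
	c.Inc("PostFilter", "zone-a")
	c.Inc("Unreserve", "") // pod with no preferred AZ

	fmt.Println("PostFilter/zone-a:", c.Get("PostFilter", "zone-a"))
	fmt.Println("Unreserve/unknown:", c.Get("Unreserve", unknownAZ))
}
```

Using a fixed fallback value instead of dropping the label keeps the label set uniform across all series, which tends to make aggregating by AZ simpler. The same pattern would extend to node or node group labels, at the cost of higher label cardinality.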