uswitch / kiam

Integrate AWS IAM with Kubernetes

Requesting additional docs around the exposed metrics

ryan-dyer-sp opened this issue

The metrics currently exposed by kiam are found here: https://github.com/uswitch/kiam/blob/master/docs/METRICS.md

However, some of the descriptions of these metrics leave a lot to be desired from the perspective of an operator (as opposed to a Kiam developer).

For example, if I want to write alarms around these metrics:
kiam_sts_issuing_errors_total - I created a deployment with a role that didn't exist. The pod itself didn't attempt to do anything AWS-related. This resulted in a single increment of this counter. That seems like a legitimate thing to watch for, but I expected more than a single failure.

kiam_metadata_credential_fetch_errors_total, kiam_metadata_credential_encode_errors_total, kiam_metadata_find_role_errors_total, and kiam_metadata_empty_role_total - these all seem like things to watch for, but what actually causes them? In what scenarios do they fire? As a user, can we even do anything about them, or do they indicate that something on the AWS backend is out of whack? Does one of them indicate that the Kiam server is having issues with its own credentials when talking to IAM in the first place?
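
To make the kind of alarm I have in mind concrete, here is a rough sketch that compares two scrapes of the metrics endpoint and reports which of the error counters named above went up. Only the metric names come from the Kiam docs; the helper names (`parse_counters`, `increased`) and the sample scrape text are hypothetical, and the parser assumes plain Prometheus text exposition lines without trailing timestamps.

```python
# Metric names are taken from this issue; everything else is illustrative.
WATCHED = (
    "kiam_sts_issuing_errors_total",
    "kiam_metadata_credential_fetch_errors_total",
    "kiam_metadata_credential_encode_errors_total",
    "kiam_metadata_find_role_errors_total",
    "kiam_metadata_empty_role_total",
)


def parse_counters(text):
    """Sum sample values per metric name from Prometheus text exposition.

    Assumes each sample line ends with its value (no trailing timestamp).
    """
    totals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_part, _, value = line.rpartition(" ")
        name = name_part.split("{", 1)[0]  # drop any label set
        totals[name] = totals.get(name, 0.0) + float(value)
    return totals


def increased(previous, current, watched=WATCHED):
    """Return {metric: delta} for watched counters that grew between scrapes."""
    deltas = {}
    for name in watched:
        before = previous.get(name, 0.0)
        after = current.get(name, 0.0)
        if after > before:  # a decrease (counter reset) is ignored here
            deltas[name] = after - before
    return deltas


if __name__ == "__main__":
    # Hypothetical scrape output, not captured from a real Kiam instance.
    first = "kiam_sts_issuing_errors_total 0\nkiam_metadata_empty_role_total 2\n"
    second = "kiam_sts_issuing_errors_total 1\nkiam_metadata_empty_role_total 2\n"
    print(increased(parse_counters(first), parse_counters(second)))
```

A check like this only tells me that a counter moved. It does not tell me why, which is the gap in the docs I am asking about.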

Any additional information around the use cases that could result in these metrics firing would be greatly appreciated.