errors from Prometheus?

Question

bridgetkromhout opened this issue 6 years ago

Looking at http://70.37.55.196:31277/#!/pod/default/winsome-wasp-prometheus-alertmanager-784d9bddf6-4pcqw?namespace=default I see this:

I ran through that section starting at https://oscon2018.container.training/#325 pretty quickly so maybe I missed something, but I'm left with two questions:

Is this expected: the persistent volume claim error and the related percentage unavailable?

What am I missing in the Prometheus section? Shouldn't I have a URL I send people to, to look at it?

Jérôme Petazzoni · Answer 1 · Tue Jul 17 2018 23:14:54 GMT+0800 (China Standard Time)

If memory serves, two components in the Prometheus Helm chart require persistent volumes (I think they are deployments, but I'm not 100% sure, so I'm being vague on purpose here): the Prometheus server itself (for data retention), and ... maybe the alerter or something?

So, trying to deploy it on a cluster without persistence basically fails. Which is why I added the flags to disable persistence for the Prometheus server itself: it lets it start and collect data (even in a fragile way). The other component doesn't start, but we don't care about it.
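For reference, here is a minimal sketch of what disabling persistence could look like as a Helm values file. It assumes the chart in question is `stable/prometheus` and that it exposes `persistentVolume.enabled` keys under `server` and `alertmanager`; the thread does not name the exact flags, so treat the key names and the `values.yaml` file name as assumptions to verify against the chart's own defaults.

```yaml
# values.yaml (hypothetical file name)
# Assumed keys from the stable/prometheus chart; confirm them with
# `helm inspect values stable/prometheus` before relying on this.
server:
  persistentVolume:
    enabled: false   # server uses ephemeral storage; metrics are lost when the pod restarts
alertmanager:
  persistentVolume:
    enabled: false   # optional: should let the alerting component start without a volume claim
```

Under those assumptions, something like `helm install stable/prometheus -f values.yaml` would apply it. Leaving the `alertmanager` block out matches the setup described above, where only the server's persistence is disabled and the other component stays pending.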

Let me know if you'd like me to dig more to solidify this explanation!

(I'll definitely update that section later.)

Bridget Kromhout · Answer 2 · Tue Jul 17 2018 23:32:12 GMT+0800 (China Standard Time)

> So, trying to deploy it on a cluster without persistence basically fails

So, the failure is expected. That's cool! I just wanted to make sure I wasn't uncovering a Surprise Failure.