inadarei / rfc-healthcheck

Health Check Response RFC Draft for HTTP APIs

Home Page: https://inadarei.github.io/rfc-healthcheck/

In Details, report threshold

christianhujer opened this issue

The Details structure could report an optional threshold to show at what level the status changes from pass to warn.

This would change the second sentence of 4.4 to "Clarifies the unit of measurement in which observedValue and threshold are reported, [...]".

This would add a section to chapter 4. Details:
"4.X thresholdValue
thresholdValue: (optional) could be any valid JSON value, such as: string, number, object, array or literal. This is the value above or below which the observedValue would change the status from pass to warn."
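To make the proposal concrete, here is a rough sketch of what a Details object with the new field might look like for the CPU utilization case. The `thresholdValue` field is the one proposed in this issue and is not part of the draft; the `cpu_details` helper, the component key, and the numbers are made up for illustration, and the sketch assumes that a higher value is worse.

```python
import json


def cpu_details(observed: float, threshold: float) -> dict:
    """Build one Details object for a CPU utilization check.

    Assumption: the status flips from pass to warn once the observed
    value reaches the threshold (higher is worse).
    """
    return {
        "componentType": "system",
        "observedValue": observed,
        "observedUnit": "percent",
        # Hypothetical field proposed in this issue, not in the draft.
        "thresholdValue": threshold,
        "status": "warn" if observed >= threshold else "pass",
    }


if __name__ == "__main__":
    # Illustrative values only.
    body = {"cpu:utilization": [cpu_details(observed=85, threshold=80)]}
    print(json.dumps(body, indent=2))
```

A single threshold like this says nothing about direction, so a client still has to know out of band whether crossing it upward or downward is the bad case.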

I had the idea when looking at the CPU utilization in the example and thought of implementing it.

Our team had a discussion about reporting threshold values this morning and reached the consensus that it was important to have both the warn and fail thresholds reported even when the component status is pass. If you're passing, how close are you to the warn threshold? If your current status is warn, why? And how close are you to the fail threshold?

So we'd like to see a failThreshold and a passThreshold on every numeric check.
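A minimal sketch of how the two thresholds could answer the "how close are we" question. It assumes that a higher value is worse, that values below `passThreshold` pass, and that values at or above `failThreshold` fail; the thread does not define these semantics, and the `check` and `headroom` helpers are hypothetical names.

```python
def check(observed: float, pass_threshold: float, fail_threshold: float) -> dict:
    """Report a numeric check together with both thresholds.

    Assumed semantics (higher is worse):
      observed <  pass_threshold                  -> pass
      pass_threshold <= observed < fail_threshold -> warn
      observed >= fail_threshold                  -> fail
    """
    if observed >= fail_threshold:
        status = "fail"
    elif observed >= pass_threshold:
        status = "warn"
    else:
        status = "pass"
    return {
        "observedValue": observed,
        "passThreshold": pass_threshold,  # field name from this thread
        "failThreshold": fail_threshold,  # field name from this thread
        "status": status,
    }


def headroom(detail: dict) -> float:
    """Distance from the observed value to the next worse status."""
    if detail["status"] == "pass":
        return detail["passThreshold"] - detail["observedValue"]
    if detail["status"] == "warn":
        return detail["failThreshold"] - detail["observedValue"]
    return 0  # already failing, nothing left to lose


if __name__ == "__main__":
    for value in (70, 85, 97):
        detail = check(value, pass_threshold=80, fail_threshold=95)
        print(detail["status"], headroom(detail))
```

Because both thresholds are reported regardless of the current status, a reader can see the full picture even when the component is passing.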

Interesting idea, but I think it should be failRange and passRange; otherwise directionality etc. may become a problem, and a range seems to provide more flexibility anyway.

@inadarei Is there a need to account for hysteresis? In any case, our team has implemented this under the auspices of section 4.10 of the specification. Clearly the specification can't provide named fields for everything someone might need but this request might be common enough to warrant adding to the spec.

Can you share details on how you implemented failThreshold and passThreshold? How does a client know if "more than" that value is good, or "less than" that value is good?

Thank you

They're more intended for humans to read ... the request was "we want to know how close to a warn or fail we are". The idea of a range is better for machine processing but we'd also need the concept of open-ended and closed ranges.
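For the machine-readable variant, one possible shape for a range that supports both open-ended and closed bounds is sketched below. The `Range` class, its field names, and the rule that anything outside both ranges is a warn are all assumptions made for illustration; nothing like this is defined in the draft.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Range:
    """An interval whose ends may be missing (open-ended) or inclusive."""

    low: Optional[float] = None   # None means unbounded below
    high: Optional[float] = None  # None means unbounded above
    low_closed: bool = True       # True means the low bound is included
    high_closed: bool = False     # True means the high bound is included

    def contains(self, value: float) -> bool:
        if self.low is not None:
            if value < self.low or (value == self.low and not self.low_closed):
                return False
        if self.high is not None:
            if value > self.high or (value == self.high and not self.high_closed):
                return False
        return True


def status(value: float, pass_range: Range, fail_range: Range) -> str:
    """Assumption: fail wins, then pass, and everything else is warn."""
    if fail_range.contains(value):
        return "fail"
    if pass_range.contains(value):
        return "pass"
    return "warn"


if __name__ == "__main__":
    # Higher is worse, e.g. CPU utilization in percent.
    cpu_pass, cpu_fail = Range(high=80), Range(low=95)
    # Lower is worse, e.g. free disk space in percent.
    disk_pass, disk_fail = Range(low=20), Range(high=5, high_closed=True)
    print(status(85, cpu_pass, cpu_fail), status(10, disk_pass, disk_fail))
```

With ranges the client no longer needs to know whether "more than" or "less than" is the good side, since the direction is carried by which bound is present.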