trinodb / trino-gateway

Home Page: https://trinodb.github.io/trino-gateway/


Feature request: Introduce different states to prevent clusters from being turned inactive programmatically

andythsu opened this issue

Problem: Currently, if a backend stops responding to health checks, trino-gateway marks it inactive. Since health checks only run against active backends, an inactive backend is never turned active again.

Solution: introduce healthy/pending/unhealthy states that reflect the health of each backend. These states are independent of the active/inactive flag, so an unhealthy but active backend keeps being health-checked and can possibly be turned healthy again.

The healthy state means the backend is healthy and ready to serve requests.
The pending state applies when a backend is switched from inactive to active. The backend stays pending until a health check succeeds, at which point it is marked healthy.
The unhealthy state means the backend is returning errors or not responding to health checks. As long as the backend is still active, health checks continue to run against it.

When a backend is active and its health check passes, the health state should be HEALTHY.

When a backend is inactive, the health state should be PENDING.

When a backend is turned active from inactive, the health state should be PENDING until a health check passes.
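The transition rules above could be sketched roughly as follows. This is a hypothetical illustration, not code from trino-gateway: the names `HealthState` and `nextState` are invented, and a `null` check result stands for "no health check has completed since the backend was (re)activated".

```java
// Hypothetical sketch of the proposed health-state transitions.
// None of these identifiers come from the actual trino-gateway codebase.
public class HealthStateSketch {
    enum HealthState { HEALTHY, PENDING, UNHEALTHY }

    // active: the operator-controlled active/inactive flag
    // lastCheckPassed: result of the most recent health check since the
    // backend was (re)activated; null means no check has completed yet
    static HealthState nextState(boolean active, Boolean lastCheckPassed) {
        if (!active || lastCheckPassed == null) {
            // Inactive or just-reactivated backends must pass a health
            // check before they are considered healthy.
            return HealthState.PENDING;
        }
        return lastCheckPassed ? HealthState.HEALTHY : HealthState.UNHEALTHY;
    }

    public static void main(String[] args) {
        System.out.println(nextState(true, true));   // HEALTHY: active, check passed
        System.out.println(nextState(false, null));  // PENDING: inactive, not checked
        System.out.println(nextState(true, null));   // PENDING: reactivated, no check yet
        System.out.println(nextState(true, false));  // UNHEALTHY: active, failing checks
    }
}
```

The key point the sketch captures is that the active flag and the health state stay separate: an active backend that fails checks becomes UNHEALTHY but is never flipped to inactive, so it keeps being checked and can recover.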

Instead of deactivating backends, we could switch to using the TrinoQueueLengthRoutingTable as the default. This routing strategy will automatically ignore any backend that does not respond with the number of running and queued queries.
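A minimal sketch of that behavior, under the assumption that the strategy routes to the backend reporting the fewest running + queued queries (the class and method names here are invented for illustration, not the actual TrinoQueueLengthRoutingTable implementation):

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative sketch of queue-length-based routing: pick the backend with
// the fewest running + queued queries, silently skipping any backend that
// did not report its stats. Hypothetical names, not trino-gateway code.
public class QueueLengthRoutingSketch {
    record BackendStats(String name, Integer running, Integer queued) {
        boolean reporting() { return running != null && queued != null; }
        int load() { return running + queued; }
    }

    static Optional<String> pickBackend(List<BackendStats> backends) {
        return backends.stream()
                .filter(BackendStats::reporting)                  // drop non-responding backends
                .min(Comparator.comparingInt(BackendStats::load)) // least-loaded wins
                .map(BackendStats::name);
    }

    public static void main(String[] args) {
        List<BackendStats> backends = List.of(
                new BackendStats("trino-1", 5, 2),
                new BackendStats("trino-2", null, null), // not responding: ignored
                new BackendStats("trino-3", 1, 0));
        System.out.println(pickBackend(backends).orElse("none")); // trino-3
    }
}
```

Note that in this sketch a backend reporting running == queued == 0 has the lowest possible load and would be preferred, which is relevant to the later question about coordinators with zero active workers.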

At first I thought the gateway only checks active backends because when a user disables backends, e.g. for deployment purposes, they should not go back to enabled status automatically; a manual interaction is needed.

unhealthy but active backends can still be checked and possibly turned active again.

So, if I understood it correctly, health checks only run against backends whose active status is true, and backends with active set to false are not checked? Just clarifying what you intended :)


This is correct. ActiveClusterMonitor only checks against active backends.

This feature allows an active backend that was unhealthy for any reason to recover and start receiving requests again on its own.


@willmostly I think it also improves the UX, because deactivating backends shows their health as "unhealthy" in the UI, which lets users know the backends are down at the moment.

Open to any other discussions.


@willmostly out of curiosity, what would this implementation do if the active node count for a coordinator is 0? Would it still route to it because running == queued == 0?

My vote (albeit not important) is for the current implementation you've gone with, because it doesn't affect routing and fixes what is (IMO) broken functionality where backends can't self-heal.

The only risky part of this implementation that I can see is that a flapping backend can continually rejoin the routing, accept and fail queries, drop out as unhealthy, and repeat. But I believe the benefit of a healthy backend being able to rejoin routing after a false-positive health failure outweighs the cons.

We actually implemented this feature in our internal fork of trino-gateway; I'll see about getting it cleared and uploaded here as a branch if there's any interest. This feature was particularly important for our deployment: we run on k8s, and redeployments of our Trino clusters would bring them in and out of "active".

-edit-

Just noticed that you had a PR open for this! So cool.