[Flaky Test] HA multi-zone tests time out frequently
MichaelEischer opened this issue
How to categorize this issue?
/area testing
/kind flake
Which test(s)/suite(s) are flaking:
The tests run by the ProwJob pull-gardener-e2e-kind-ha-multi-zone. More specifically, [It] Shoot Tests Shoot with workers Create, Update, Delete [Shoot, default, basic, simple].
CI link:
https://prow.gardener.cloud/view/gs/gardener-prow/pr-logs/pull/gardener_gardener/9449/pull-gardener-e2e-kind-ha-multi-zone/1779763882739372032
https://testgrid.k8s.io/gardener-gardener#ci-gardener-e2e-kind-ha-multi-zone , for example https://prow.gardener.cloud/view/gs/gardener-prow/logs/ci-gardener-e2e-kind-ha-multi-zone/1779501481980858368
Reason for failure:
Apparently there are too many machines:
{"level":"info","ts":"2024-04-14T18:03:00.661Z","logger":"shoot-test.test","msg":"Shoot is not yet reconciled","shoot":{"name":"e2e-default","namespace":"garden-local"},"reason":"condition type EveryNodeReady is not true yet, had message too many worker nodes are registered. Exceeding maximum desired machine count (4/3) with reason NodesScalingDown"}
I've noticed the flaky test as part of #9449, but the test runs on testgrid also contain the exact same error message.
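To check other runs for the same failure, a small script can scan the test log for this message. The following is only a sketch: it assumes each log line is a single JSON object shaped like the excerpt above, and the helper name `excess_machines` is hypothetical, not part of Gardener or Prow.

```python
import json
import re

# Matches the "(registered/desired)" pair in the condition message quoted above.
COUNT_RE = re.compile(r"Exceeding maximum desired machine count \((\d+)/(\d+)\)")


def excess_machines(log_line):
    """Return (registered, desired) if the line reports too many worker nodes, else None.

    Hypothetical helper: assumes one JSON object per line with a "reason" field.
    """
    try:
        entry = json.loads(log_line)
    except json.JSONDecodeError:
        return None  # not a structured log line
    if not isinstance(entry, dict):
        return None
    match = COUNT_RE.search(str(entry.get("reason", "")))
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# The log line from this report, copied verbatim.
line = (
    '{"level":"info","ts":"2024-04-14T18:03:00.661Z","logger":"shoot-test.test",'
    '"msg":"Shoot is not yet reconciled",'
    '"shoot":{"name":"e2e-default","namespace":"garden-local"},'
    '"reason":"condition type EveryNodeReady is not true yet, had message too many '
    'worker nodes are registered. Exceeding maximum desired machine count (4/3) '
    'with reason NodesScalingDown"}'
)

print(excess_machines(line))  # → (4, 3)
```

Applied to every line of a downloaded build log, any non-None result marks a run that hit the same "too many worker nodes" condition.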
The Gardener project currently lacks enough active contributors to adequately respond to all issues.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Mark this issue as rotten with /lifecycle rotten
- Close this issue with /close
/lifecycle stale