After loadBalancerService client init fails, we should surface the failure message in the OpenStackCluster status
Goend opened this issue · comments
/kind bug
What steps did you take and what happened:
Load balancer service client initialization failed with an error such as: `failed to create load balancer service client: No suitable endpoint could be found in the service catalog.`
What did you expect to happen:
The OpenStackCluster status should report an error when this failure cannot be automatically recovered.
Anything else you would like to add:
Environment:
- Cluster API Provider OpenStack version (or `git rev-parse HEAD` if manually built): latest
- Cluster-API version: latest
- OpenStack version:
- Minikube/KIND version:
- Kubernetes version (use `kubectl version`): 1.20.14
- OS (e.g. from `/etc/os-release`):
The relevant code is in `cluster-api-provider-openstack/controllers/openstackcluster_controller.go`, lines 726 to 729 at commit 4ab8b3a.
Maybe we should add some code like this:
`handleUpdateOSCError(openStackCluster, errors.Errorf("failed to reconcile load balancer: %v", err))`
Makes sense to me. @Goend, will you propose a PR fixing this?
But first, we need to confirm that this error is a terminal failure; under that condition, I can submit a PR. @dulek
So we first need someone to confirm whether this behavior is a bug. We may need more input from the community.
Alright, let's try to analyze this here. The problem is that the OpenStackCluster enabled a load balancer, but the cloud doesn't have an Octavia endpoint, so it's impossible to fulfill that request. We could silently ignore it and just go on without creating the LB, but that would mean implicitly ignoring the user's request. That's not something I'd do; it's better to explicitly tell the user that something doesn't work.
Given these assumptions, this feels like a pretty terminal failure, unless we'd like to wait until the cloud is updated with an Octavia installation. Getting Octavia installed doesn't exactly sound like something that happens within the cluster installation timeout, so I'd say it's terminal.
Fine, I will propose a PR to fix it.