[feature request] Improve the implementation of workload creation in yurtappset
vie-serendipity opened this issue
What would you like to be added:
When slowstartbatch encounters its first error, it simply terminates. Ideally, it would try to create all the workloads first and resolve any problems at the next reconcile. The current implementation should be improved accordingly.
openyurt/pkg/yurtmanager/controller/yurtappset/yurtappset_controller.go
Lines 368 to 370 in 34b14cc
Why is this needed:
When something unexpected happens, we want most of the workloads to be up and running first. With the current implementation, reconciliation is likely to get stuck on one failing workload and keep looping.
others
/kind feature
@vie-serendipity It seems that the current slowstartbatch fail policy is reasonable, because it is good to fail fast until the problem is solved.
@vie-serendipity Would you like to tell me more details about this issue?
@rambohe-ch I'm not sure, but I'd like to take a fault-tolerant strategy during reconcile to get as close to the desired state of the spec as possible, i.e., continue trying to create the other resources after one fails to be created. @luc99hen What do you think?
Actually, this is the same logic that ReplicaSet uses to create pods. Kubernetes adopts a fail-fast policy, so it's the same here.