openyurtio / openyurt

OpenYurt - Extending your native Kubernetes to edge (project under CNCF)

Home Page: https://openyurt.io

[feature request] Improve the implementation of workloads creating in yurtappset

vie-serendipity opened this issue

What would you like to be added:
When SlowStartBatch encounters its first error, it simply terminates. Ideally, it would try to create all the workloads first and leave any failures to be resolved at the next reconcile. The current implementation should be improved to behave this way.

// 1. create workloads
if len(needCreateNodePools) > 0 {
createdNum, createdErr := util.SlowStartBatch(len(needCreateNodePools), slowStartInitialBatchSize, func(idx int) error {

Why is this needed:
When something unexpected happens, we want most of the workloads to be created and running first. With the current implementation, the controller is likely to get stuck on one failing workload and keep looping on it, so the remaining workloads are never created.

others
/kind feature

@vie-serendipity It seems that the current SlowStartBatch failure policy is reasonable, because it is good to fail fast until the problem is solved.

@vie-serendipity Could you share more details about this issue?

@rambohe-ch I'm not sure, but I'd like to take a fault-tolerant strategy during reconcile so that we get as close as possible to the desired state of the spec, i.e., continue trying to create the other resources after one creation fails. @luc99hen What do you think?
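
For illustration, the fault-tolerant strategy proposed above could look roughly like the following minimal sketch. The names createAllTolerant and failOn are hypothetical and do not exist in OpenYurt; the sketch only assumes a create callback with the same func(idx int) error shape that is passed to util.SlowStartBatch in the snippet from the issue.

```go
package main

import (
	"errors"
	"fmt"
)

// failOn is a hypothetical helper for the demo: it returns a create
// function that fails only for the given index.
func failOn(bad int) func(idx int) error {
	return func(idx int) error {
		if idx == bad {
			return errors.New("create failed")
		}
		return nil
	}
}

// createAllTolerant is a hypothetical alternative to a fail-fast batch:
// it attempts every index in [0, count), records each failure, and keeps
// going. Failed items are left for the next reconcile to retry.
func createAllTolerant(count int, fn func(idx int) error) (int, []error) {
	created := 0
	var errs []error
	for idx := 0; idx < count; idx++ {
		if err := fn(idx); err != nil {
			errs = append(errs, fmt.Errorf("index %d: %w", idx, err))
			continue // do not stop at the first failure
		}
		created++
	}
	return created, errs
}

func main() {
	created, errs := createAllTolerant(8, failOn(2))
	fmt.Printf("created %d of 8, %d error(s)\n", created, len(errs))
}
```

With one failing item out of eight, this sketch still creates the other seven. The trade-off is that a systemic failure (for example, a rejected API request that affects every workload) would be attempted once per item instead of once per reconcile.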

Actually, this is the same logic that ReplicaSet uses to create pods. Kubernetes adopts a fail-fast policy, so it's the same here.
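
To make the fail-fast behavior concrete, here is a minimal sketch of a slow-start batch modeled on the helper used by the Kubernetes ReplicaSet controller. It is not the OpenYurt source: the lowercase slowStartBatch, minInt, and failAt names are assumptions for this example, and the index-passing callback is only inferred from the call shown in the issue.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// minInt avoids depending on the Go 1.21 builtin min.
func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// failAt is a hypothetical helper for the demo: it returns a create
// function that fails only for the given index.
func failAt(bad int) func(idx int) error {
	return func(idx int) error {
		if idx == bad {
			return errors.New("create failed")
		}
		return nil
	}
}

// slowStartBatch (sketch) calls fn for indexes [0, count) in batches that
// start at initialBatchSize and double after each fully successful batch.
// If any call in a batch fails, it waits for that batch to finish and then
// returns without starting the remaining batches (fail fast).
func slowStartBatch(count, initialBatchSize int, fn func(idx int) error) (int, error) {
	remaining := count
	successes := 0
	next := 0 // next index to hand out
	for batchSize := minInt(remaining, initialBatchSize); batchSize > 0; batchSize = minInt(2*batchSize, remaining) {
		errCh := make(chan error, batchSize)
		var wg sync.WaitGroup
		wg.Add(batchSize)
		for i := 0; i < batchSize; i++ {
			go func(idx int) {
				defer wg.Done()
				if err := fn(idx); err != nil {
					errCh <- err
				}
			}(next)
			next++
		}
		wg.Wait()
		successes += batchSize - len(errCh)
		if len(errCh) > 0 {
			return successes, <-errCh // skip all later batches
		}
		remaining -= batchSize
	}
	return successes, nil
}

func main() {
	created, err := slowStartBatch(8, 1, failAt(2))
	fmt.Printf("created %d of 8, err: %v\n", created, err)
}
```

In this run the first batch holds index 0 and the second holds indexes 1 and 2. Index 2 fails, so the function returns after two successes and indexes 3 through 7 are never attempted. That is the behavior the issue asks to change, and also the reason the policy limits wasted calls when the failure is systemic.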