carvel-dev / kapp

kapp is a simple deployment tool focused on the concept of "Kubernetes application" — a set of resources with the same label

Home Page: https://carvel.dev/kapp

Skip checking resources when `--wait=false` is specified

firgavin opened this issue

What steps did you take:
I currently use kapp in CI to manage a large number of YAML files. I pass --wait=false when deleting an app because deleting custom resources can sometimes take a long time.

What happened:
kapp exits with a non-zero code, which makes CI fail.

$ kapp delete -a app1 --wait=false -y
Target cluster 'https://127.0.0.1:6443' (nodes: firgavin)

Changes

Namespace  Name        Kind        Age  Op      Op st.  Wait to  Rs  Ri  
default    simple-app  Deployment  22s  delete  -       -        ok  -  
^          simple-app  Service     22s  delete  -       -        ok  -  

Op:      0 create, 2 delete, 0 update, 0 noop, 0 exists
Wait to: 0 reconcile, 0 delete, 2 noop

11:18:59AM: ---- applying 2 changes [0/2 done] ----
11:18:59AM: delete deployment/simple-app (apps/v1) namespace: default
11:18:59AM: delete service/simple-app (v1) namespace: default
11:18:59AM: ---- waiting on 2 changes [0/2 done] ----
11:18:59AM: ok: noop service/simple-app (v1) namespace: default
11:18:59AM: ok: noop deployment/simple-app (apps/v1) namespace: default
11:18:59AM: ---- applying complete [2/2 done] ----
11:18:59AM: ---- waiting complete [2/2 done] ----

kapp: Error: Expected all resources to be gone, but found: endpointslice/simple-app-vp2dw (discovery.k8s.io/v1) namespace: default, pod/simple-app-64c66864f5-g9sb8 (v1) namespace: default, replicaset/simple-app-64c66864f5 (apps/v1) namespace: default

What did you expect:
Kapp could skip checking resources when --wait=false is specified.

Anything else you would like to add:
I did some research and I found that kapp checks the existence of related resources after applying changes. But resources will be deleted eventually. See https://github.com/vmware-tanzu/carvel-kapp/blob/v0.52.0/pkg/kapp/cmd/app/delete.go#L159.
It would be great if kapp could default to skipping checking resources when --wait=false is specified or add a flag to control this logic. And if that makes sense, I'd like to help implement this ;)
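
To make the behavior concrete, here is a rough sketch of the kind of post-delete check described above. It is not kapp's actual code: `resource`, `ensureAllGone`, and the sample data are hypothetical names made up for illustration, and the error text only mirrors the message in the log above.

```go
package main

import (
	"fmt"
	"strings"
)

// resource is a hypothetical stand-in for a cluster resource carrying the app label.
type resource struct {
	Kind, Name, Namespace string
}

func (r resource) String() string {
	return fmt.Sprintf("%s/%s namespace: %s", r.Kind, r.Name, r.Namespace)
}

// ensureAllGone returns an error if any labeled resource is still present
// after the delete requests have been issued.
func ensureAllGone(remaining []resource) error {
	if len(remaining) == 0 {
		return nil
	}
	descs := make([]string, 0, len(remaining))
	for _, r := range remaining {
		descs = append(descs, r.String())
	}
	return fmt.Errorf("Expected all resources to be gone, but found: %s", strings.Join(descs, ", "))
}

func main() {
	// With --wait=false, children such as pods and replicasets can still be
	// terminating when the check runs, so the check fails.
	remaining := []resource{
		{Kind: "pod", Name: "simple-app-64c66864f5-g9sb8", Namespace: "default"},
		{Kind: "replicaset", Name: "simple-app-64c66864f5", Namespace: "default"},
	}
	if err := ensureAllGone(remaining); err != nil {
		fmt.Println("kapp: Error:", err)
	}
}
```

The point of the sketch is that the check only looks at what exists at that instant, so skipping the wait makes a failure likely even though the resources are on their way out.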

Environment:

  • kapp version (use kapp --version): v0.52.0
  • OS (e.g. from /etc/os-release): Ubuntu 20.04.4 LTS
  • Kubernetes version (use kubectl version): v1.23.6+k3s1

Vote on this request

This is an invitation to the community to vote on issues, to help us prioritize our backlog. Use the "smiley face" up to the right of this comment to vote.

👍 "I would like to see this addressed as soon as possible"
👎 "There are other more important things to focus on right now"

We are also happy to receive and review Pull Requests if you want to help work on this issue.

Yeah, it seems like setting the wait flag to false currently leads to an error while deleting recorded apps. So it's definitely a bug.

It would be great if kapp could default to skipping checking resources when --wait=false is specified or add a flag to control this logic.

It does make sense to allow that behaviour; I am just trying to think of any side effects it could have. One obvious thing that could happen is that one or more resources are not deleted but the app itself (metadata configmap) is deleted.
@cppforlife Any thoughts?

And if that makes sense, I'd like to help implement this ;)

That would be great, we will definitely review it on priority once we finalize the approach :)

Hey @firgavin, good to see you here. Looking forward to your PR for this issue.

One obvious thing that could happen is that one or more resources are not deleted but the app itself

This would be a "known risk" I guess?

We might also lose out on some "retryable cases", where kapp would retry in case of a failed delete due to a retryable error.

I did some research and I found that kapp checks the existence of related resources after applying changes. But resources will be deleted eventually.

I think an additional flag to disable this check would be reasonable. Maybe under dangerous?

I think an additional flag to disable this check would be reasonable. Maybe under dangerous?

This approach makes sense to me

Hi @cppforlife, @100mik, @praveenrewar - Thanks for your insights! Here's my proposal:

We can add a flag --dangerous-disable-checking-app-deletion to enable or disable the check:

  • The value is set to false by default, which is compatible with the current behavior.
  • Once the flag is specified, kapp skips this check, and users might need to manually delete related resources.
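
A minimal sketch of how the proposed flag could gate the check, using only Go's standard `flag` package rather than kapp's real command wiring. `deleteOpts`, `parseDeleteOpts`, and `finishDelete` are hypothetical names, and the `--wait` default of true is an assumption here.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// deleteOpts holds the two flags discussed in this thread (hypothetical struct).
type deleteOpts struct {
	Wait                 bool
	DisableDeletionCheck bool
}

// parseDeleteOpts parses the flags independently of each other.
func parseDeleteOpts(args []string) (deleteOpts, error) {
	var o deleteOpts
	fs := flag.NewFlagSet("delete", flag.ContinueOnError)
	// Default of true for --wait is assumed for this sketch.
	fs.BoolVar(&o.Wait, "wait", true, "wait for changes to finish")
	// Defaults to false, so the current behavior is preserved.
	fs.BoolVar(&o.DisableDeletionCheck, "dangerous-disable-checking-app-deletion", false,
		"skip the check that all app resources are gone")
	err := fs.Parse(args)
	return o, err
}

// finishDelete runs the final check unless it was explicitly disabled.
// remaining is the number of labeled resources still found in the cluster.
func finishDelete(o deleteOpts, remaining int) error {
	if o.DisableDeletionCheck {
		return nil // the user accepts that leftovers may need manual cleanup
	}
	if remaining > 0 {
		return errors.New("Expected all resources to be gone")
	}
	return nil
}

func main() {
	o, err := parseDeleteOpts([]string{"--wait=false", "--dangerous-disable-checking-app-deletion"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("check result:", finishDelete(o, 3))
}
```

In this sketch the check depends only on the new flag, so `--wait=false` on its own would still hit the error.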

Before I work on it, I'd like to discuss the interaction between the two flags. When --dangerous-disable-checking-app-deletion=false, should we make sure that the value of --wait is overwritten to True? If not, users can still hit the same issue. Of course, we can explain the usage in the docs if we think they should be "orthogonal". Any suggestions?

When --dangerous-disable-checking-app-deletion=false, should we make sure that the value of --wait is overwritten to True?

I think we should keep these two flags independent of each other, because a user should be able to use --dangerous-disable-checking-app-deletion regardless of whether --wait is enabled or disabled.

If not, users can still hit the same issue. Of course, we can explain the usage in the docs if we think they should be "orthogonal". Any suggestions?

Maybe we can add a hint in the error message?
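
For illustration, a hint like that could be appended only when waiting was turned off. This is a sketch under assumptions: `deletionCheckError` and the hint wording are hypothetical, and the flag name is the one proposed earlier in this thread, not an existing kapp flag.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// waitHint is hypothetical wording for the suggested hint.
const waitHint = " (hint: --wait=false was set, so resources may still be terminating;" +
	" consider --dangerous-disable-checking-app-deletion)"

// deletionCheckError builds the error for leftover resources and adds the
// hint only when the user disabled waiting.
func deletionCheckError(remaining []string, wait bool) error {
	if len(remaining) == 0 {
		return nil
	}
	msg := "Expected all resources to be gone, but found: " + strings.Join(remaining, ", ")
	if !wait {
		msg += waitHint
	}
	return errors.New(msg)
}

func main() {
	err := deletionCheckError([]string{"pod/simple-app-64c66864f5-g9sb8 (v1) namespace: default"}, false)
	fmt.Println("kapp: Error:", err)
}
```

This keeps the two flags independent while still pointing users who hit the error toward the new option.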