carvel-dev / kapp

kapp is a simple deployment tool focused on the concept of "Kubernetes application" — a set of resources with the same label

Home Page: https://carvel.dev/kapp

Intermittent failure when executing kapp deploy command

abubakarm94 opened this issue

What steps did you take:
Occasionally, executing the kapp deploy command fails with the following error:

What happened:

      if [ "$(kubectl api-resources --api-group=kappctrl.k14s.io -o name | wc -l)" -eq 0 ] ; then \
      	kapp deploy \
      		--yes \
      		--app=kc \
      		--namespace=kube-system \
      		--file=https://github.com/vmware-tanzu/carvel-kapp-controller/releases/latest/download/release.yml ; \
      fi
      Target cluster 'https://<omitted>:<port>' (nodes: <omitted>, 3+)
        kapp: Error: error converting YAML to JSON: yaml: line 8: mapping values are not allowed in this context

Kapp version: kapp version 0.47.0

@abubakarm94 looks like we missed this issue. This seems to be a kapp issue, so I'm going to transfer it to that repo...

@abubakarm94 Thank you for creating the issue.
The message "kapp: Error: error converting YAML to JSON: yaml" suggests that something is wrong in the YAML provided to kapp, but I don't see any issue with the kapp-controller release.yml. Does the error happen while using https://github.com/vmware-tanzu/carvel-kapp-controller/releases/latest/download/release.yml or with a modified version of the YAML?

Hi @praveenrewar, It happens when using https://github.com/vmware-tanzu/carvel-kapp-controller/releases/latest/download/release.yml

I see. Just a few more follow-up questions while I am trying to reproduce the issue...

  • Does the issue happen randomly or under some specific conditions?
  • Does it get resolved automatically by re-running the command?
  • I am also curious about your specific use case :) (since you are using an if condition, I am guessing that you have some sort of script running on a regular basis?)

Hi @praveenrewar, happy to help. here are my responses to your questions:

  • It happens randomly, I've not observed that it happens under certain conditions.
  • It does get resolved if I continue to re-run the command, it takes up to 3-4 tries on occasion.
  • We have a test on our CI that attempts to install a particular application via Kapp. And if Kapp is already installed on the test cluster, it doesn't make sense to reinstall it, hence the need for the IF condition.

Feel free to let me know if you have any other questions.

Hey @abubakarm94 Thanks a lot for the quick answers.

> We have a test on our CI that attempts to install a particular application via Kapp. And if Kapp is already installed on the test cluster, it doesn't make sense to reinstall it, hence the need for the IF condition.

I see. Initially I thought that some rate limiting or networking issue might be causing it, but I think that would have led to a different error message.

At this point I am not sure what could be causing this issue, so I'm throwing in a couple more questions (which might sound silly):

  • How frequently does the error occur and how frequently does the job run?
  • Is the error always the same (including the line number)?

I will try to reproduce this using a script. Meanwhile, will it be possible for you to download the file locally and then use that all the time?

Not silly questions at all.

  • The error doesn't occur frequently, it's possible to go weeks without seeing it. However, on the days it occurs, it happens multiple times.
  • And yes, the error is always the same.

I'll update our script to download the file and use it the next time it happens. Thanks for the suggestion.

i bet this would correlate to github having occasional spike of errors/outages for downloading assets. looking at the code, it seems like we are missing a 2xx status check: https://github.com/vmware-tanzu/carvel-kapp/blob/e0bf1c9d485c5c59a23b4cfe75915782e7fb4b42/pkg/kapp/resources/file_sources.go#L58-L72. /cc @jtigger same in ytt.

> i bet this would correlate to github having occasional spike of errors/outages for downloading assets

I did think of something similar, but couldn't find any issue related to this anywhere, but this is the only reason that makes sense.

Adding reproduction steps for something similar:

$ kapp deploy -a test -f https://github.com/fake/path/
Target cluster 'https://192.168.64.93:8443' (nodes: minikube)

kapp: Error: error converting YAML to JSON: yaml: line 163: mapping values are not allowed in this context

Expected error should be something like this:

$ /kapp deploy -a test -f https://github.com/fake/path/
Target cluster 'https://192.168.64.93:8443' (nodes: minikube)

kapp: Error: Requesting URL 'https://github.com/fake/path/': 404 Not Found

Fix available in v0.49.0