argoproj / argo-workflows

Workflow Engine for Kubernetes

Home Page: https://argo-workflows.readthedocs.io/

Init container fails with `invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable`

zamamohx opened this issue

Pre-requisites

  • I have double-checked my configuration
  • I have tested with the :latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
  • I have searched existing issues and could not find a match for this bug
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what did you expect to happen?

When running the `hello-world-argo-workflow` workflow, the `init` init container fails with the following error:

invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

This occurs on a Kubeflow cluster. I also followed https://argo-workflows.readthedocs.io/en/latest/service-accounts/, but the error persists.

    Command:
      argoexec
      init
      --loglevel
      info
    State:          Terminated
      Reason:       Error
      Message:      invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable
      Exit Code:    2
      Started:      Wed, 24 Apr 2024 10:56:30 -0700
      Finished:     Wed, 24 Apr 2024 10:56:30 -0700
    Ready:          False
    Restart Count:  0

Expected Behavior:
The init container should initialize successfully without encountering the mentioned error.

Steps to Reproduce:

Deploy the `hello-world-argo-workflow` workflow.
Observe the failure of the init container on the Kubeflow cluster.

Version

argoexec:v3.3.10

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: hello-world-argo-workflow
spec:
  entrypoint: hello-world
  templates:
  - name: hello-world
    container:
      image: alpine:3.14
      command: [echo, "Hello, World!"]

Logs from the workflow controller

kubectl describe pod hello-world-argo-workflow -n kubeflow


Name:         hello-world-argo-workflow
Namespace:    kubeflow
Priority:     0
Node:         xxxxxxxxxxxxxxxxxxxxxxxxxx
Start Time:   Wed, 24 Apr 2024 13:40:22 -0700
Labels:       workflows.argoproj.io/completed=true
              workflows.argoproj.io/workflow=hello-world-argo-workflow
Annotations:  cni.projectcalico.org/containerID: 6e3636399f126537cd83dfcc9ed04077da1b0b501dae0bd26b5191b7c6433c9b
              cni.projectcalico.org/podIP:
              cni.projectcalico.org/podIPs:
              kubectl.kubernetes.io/default-container: main
              kubernetes.io/psp: unrestricted-psp
              workflows.argoproj.io/node-id: hello-world-argo-workflow
              workflows.argoproj.io/node-name: hello-world-argo-workflow
Status:       Failed
IP:           100.80.84.228
IPs:
  IP:           100.80.84.228
Controlled By:  Workflow/hello-world-argo-workflow
Init Containers:
  init:
    Container ID:  containerd://df52a59bb7411936cc013da317228036a280cbc838771cce7adcc36fd27deeb5
    Image:         gcr.io/ml-pipeline/argoexec:v3.3.10-license-compliance
    Image ID:      gcr.io/ml-pipeline/argoexec@sha256:70b419bd8334aeee278b49dd67b85aa69cea6cb9188c4b9fd5f3613039d77c30
    Port:          <none>
    Host Port:     <none>
    Command:
      argoexec
      init
      --loglevel
      info
    State:          Terminated
      Reason:       Error
      Message:      invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable
      Exit Code:    2
      Started:      Wed, 24 Apr 2024 13:40:23 -0700
      Finished:     Wed, 24 Apr 2024 13:40:23 -0700
    Ready:          False
    Restart Count:  0
    Environment:
      ARGO_POD_NAME:                      hello-world-argo-workflow (v1:metadata.name)
      ARGO_POD_UID:                        (v1:metadata.uid)
      ARGO_CONTAINER_RUNTIME_EXECUTOR:    emissary
      GODEBUG:                            x509ignoreCN=0
      ARGO_WORKFLOW_NAME:                 hello-world-argo-workflow
      ARGO_CONTAINER_NAME:                init
      ARGO_TEMPLATE:                      {xxxxxxxxxxxxxxxxxxxxxx}
      ARGO_NODE_ID:                       hello-world-argo-workflow
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
    Mounts:
      /argo/secret/mlpipeline-minio-artifact from mlpipeline-minio-artifact (ro)
      /var/run/argo from var-run-argo (rw)
Containers:
  wait:
    Container ID:
    Image:         gcr.io/ml-pipeline/argoexec:v3.3.10-license-compliance
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      argoexec
      wait
      --loglevel
      info
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:
      ARGO_POD_NAME:                      hello-world-argo-workflow (v1:metadata.name)
      ARGO_POD_UID:                        (v1:metadata.uid)
      ARGO_CONTAINER_RUNTIME_EXECUTOR:    emissary
      GODEBUG:                            x509ignoreCN=0
      ARGO_WORKFLOW_NAME:                 hello-world-argo-workflow
      ARGO_CONTAINER_NAME:                wait
      ARGO_TEMPLATE:                      {xxxxxxxxxxxxxxxxxxxxxx}
      ARGO_NODE_ID:                       hello-world-argo-workflow
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
    Mounts:
      /argo/secret/mlpipeline-minio-artifact from mlpipeline-minio-artifact (ro)
      /var/run/argo from var-run-argo (rw)
  main:
    Container ID:
    Image:         alpine:3.14
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /var/run/argo/argoexec
      emissary
      --
      echo
      Hello, World!
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:
      ARGO_CONTAINER_NAME:                main
      ARGO_TEMPLATE:                      {xxxxxxxxxxxxxxxxxxxxxx}
      ARGO_NODE_ID:                       hello-world-argo-workflow
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
    Mounts:
      /var/run/argo from var-run-argo (rw)
Conditions:
  Type              Status
  Initialized       False
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  var-run-argo:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  mlpipeline-minio-artifact:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  mlpipeline-minio-artifact
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  66s   default-scheduler  Successfully assigned kubeflow/hello-world-argo-workflow to xxxxxxxxxxxx
  Normal  Pulled     58s   kubelet            Container image "gcr.io/ml-pipeline/argoexec:v3.3.10-license-compliance" already present on machine
  Normal  Created    58s   kubelet            Created container init
  Normal  Started    58s   kubelet            Started container init

Logs from your workflow's wait container

invalid configuration: no configuration has been provided, try setting KUBERNETES_MASTER environment variable

This error means you don't have a ServiceAccount properly mounted. It's not an Argo error, it's a k8s client error.

Your describe output shows that the pod is indeed missing a ServiceAccount. It is also missing a mount for /var/run/secrets/kubernetes.io/serviceaccount, which is why the client reports that no k8s configuration has been provided.
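
As a rough sketch of what a workflow with a mounted ServiceAccount might look like, the reproduction manifest can be extended with the `serviceAccountName` and `automountServiceAccountToken` fields of the Workflow spec. The ServiceAccount name `my-workflow-sa` and the explicit `namespace` are assumptions for illustration only; substitute a ServiceAccount that actually exists in your namespace and has the RBAC permissions the executor needs. This is not a confirmed fix for this cluster, only an illustration of the fields involved.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: hello-world-argo-workflow
  namespace: kubeflow
spec:
  entrypoint: hello-world
  # Assumption: "my-workflow-sa" is a placeholder for a ServiceAccount that
  # exists in this namespace and is bound to the executor's RBAC role.
  serviceAccountName: my-workflow-sa
  # Ask Kubernetes to mount the ServiceAccount token into the pod, which
  # populates /var/run/secrets/kubernetes.io/serviceaccount.
  automountServiceAccountToken: true
  templates:
  - name: hello-world
    container:
      image: alpine:3.14
      command: [echo, "Hello, World!"]
```

If the mount is still absent after a change like this, it may be worth checking whether token automounting is disabled elsewhere, for example on the ServiceAccount object itself or by a cluster policy that rewrites pod specs.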

  • I have tested with the :latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.

Version

argoexec:v3.3.10

Image:         gcr.io/ml-pipeline/argoexec:v3.3.10-license-compliance

This is not :latest, and you did not describe why you didn't use :latest. Please follow the issue template instructions; they are there for a reason.

Furthermore, v3.3 is an unsupported version of Argo.

And that is a Kubeflow forked image as well, not Argo's own official image. It is not maintained by Argo.