zifeo / terraform-openstack-rke2

Easily deploy a high-availability RKE2 Kubernetes cluster on OpenStack providers like Infomaniak.

Home Page: https://registry.terraform.io/modules/zifeo/rke2/openstack/latest

cilium stuck waiting

hippiehunter opened this issue

I've got a pretty basic DevStack setup on a single physical machine, and after running a tweaked version of the sample in the README, I'm seeing the Cilium pods in a restart loop.

cilium-5q62c                                            0/1     Running     18 (5m27s ago)   94m
cilium-7pxwc                                            0/1     Running     18 (5m15s ago)   94m
cilium-c4bbc                                            0/1     Running     18 (5m15s ago)   94m
cilium-operator-5499547db9-qb4vz                        0/1     Pending     0                94m
etcd-k8s-server-a-1                                     1/1     Running     0                94m
helm-install-openstack-cinder-csi-696hj                 0/1     Completed   0                94m
helm-install-openstack-cloud-controller-manager-jcsww   0/1     Completed   0                94m
helm-install-rke2-cilium-9k5pm                          0/1     Completed   0                94m
helm-install-rke2-coredns-wkp85                         0/1     Completed   0                94m
helm-install-rke2-metrics-server-dntgs                  0/1     Pending     0                94m
helm-install-rke2-snapshot-controller-crd-dfp7r         0/1     Pending     0                94m
helm-install-rke2-snapshot-controller-gmmwl             0/1     Pending     0                94m
helm-install-rke2-snapshot-validation-webhook-fhwnb     0/1     Pending     0                94m
helm-install-velero-96vzj                               0/1     Pending     0                94m
kube-apiserver-k8s-server-a-1                           1/1     Running     0                94m
kube-controller-manager-k8s-server-a-1                  1/1     Running     0                94m
kube-scheduler-k8s-server-a-1                           1/1     Running     0                94m
openstack-cinder-csi-controllerplugin-cf5f9869d-l4crj   0/6     Pending     0                94m
rke2-coredns-rke2-coredns-5697f484ff-k4xqd              0/1     Pending     0                94m
rke2-coredns-rke2-coredns-autoscaler-597fb897d7-vrvtg   0/1     Pending     0                94m

The last couple of lines from the log of one of those pods look like this:

level=info msg="Inheriting MTU from external network interface" device=ens3 ipAddr=192.168.42.105 mtu=1442 subsys=mtu
level=info msg="Cgroup metadata manager is enabled" subsys=cgroup-manager
level=info msg="Envoy: Starting xDS gRPC server listening on /var/run/cilium/xds.sock" subsys=envoy-manager
level=info msg="Restored 0 node IDs from the BPF map" subsys=linux-datapath
level=info msg="Restored backends from maps" failedBackends=0 restoredBackends=0 subsys=service
level=info msg="Restored services from maps" failedServices=0 restoredServices=0 subsys=service
level=info msg="Reading old endpoints..." subsys=daemon
level=info msg="No old endpoints found." subsys=daemon
level=info msg="Waiting until all Cilium CRDs are available" subsys=k8s

I've tried both rke2_version = "v1.25.3+rke2r1" and rke2_version = "v1.26.4+rke2r1", with the same result. Any ideas where I should look?
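For reference, the agents all stop at "Waiting until all Cilium CRDs are available", which should clear once the operator comes up. A quick way to confirm whether the CRDs exist and whether the operator ever got scheduled (sketch only, assuming kubectl is pointed at this cluster):

kubectl get crds | grep -i cilium
kubectl -n kube-system get pods -l io.cilium/app=operator -o wide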

@hippiehunter It looks like the operator deployment is stuck in Pending. Can you run k describe pod cilium-operator-5499547db9-qb4vz and/or share the cluster config here?

Thanks for looking at this; here's the output:

Name:                 cilium-operator-5499547db9-76brl
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Service Account:      cilium-operator
Node:                 <none>
Labels:               app.kubernetes.io/name=cilium-operator
                      app.kubernetes.io/part-of=cilium
                      io.cilium/app=operator
                      name=cilium-operator
                      pod-template-hash=5499547db9
Annotations:          prometheus.io/port: 9963
                      prometheus.io/scrape: true
Status:               Pending
IP:
IPs:                  <none>
Controlled By:        ReplicaSet/cilium-operator-5499547db9
Containers:
  cilium-operator:
    Image:      rancher/mirrored-cilium-operator-aws:v1.13.0
    Port:       9963/TCP
    Host Port:  9963/TCP
    Command:
      cilium-operator-aws
    Args:
      --config-dir=/tmp/cilium/config-map
      --debug=$(CILIUM_DEBUG)
    Liveness:  http-get http://127.0.0.1:9234/healthz delay=60s timeout=3s period=10s #success=1 #failure=3
    Environment:
      K8S_NODE_NAME:             (v1:spec.nodeName)
      CILIUM_K8S_NAMESPACE:     kube-system (v1:metadata.namespace)
      CILIUM_DEBUG:             <set to the key 'debug' of config map 'cilium-config'>           Optional: true
      AWS_ACCESS_KEY_ID:        <set to the key 'AWS_ACCESS_KEY_ID' in secret 'cilium-aws'>      Optional: true
      AWS_SECRET_ACCESS_KEY:    <set to the key 'AWS_SECRET_ACCESS_KEY' in secret 'cilium-aws'>  Optional: true
      AWS_DEFAULT_REGION:       <set to the key 'AWS_DEFAULT_REGION' in secret 'cilium-aws'>     Optional: true
      KUBERNETES_SERVICE_HOST:  192.168.44.3
      KUBERNETES_SERVICE_PORT:  6443
    Mounts:
      /tmp/cilium/config-map from cilium-config-path (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zwcmg (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  cilium-config-path:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      cilium-config
    Optional:  false
  kube-api-access-zwcmg:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              kubernetes.io/os=linux
                             node-role.kubernetes.io/master=true
Tolerations:                 CriticalAddonsOnly:NoExecute op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  23m (x220 over 18h)  default-scheduler  0/3 nodes are available: 3 node(s) had untolerated taint {node.cloudprovider.kubernetes.io/uninitialized: true}. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..

@hippiehunter Great, it seems the module has been missing a toleration needed for cluster bootstrap since the last major update. Can you try re-applying with version 2.0.2 of this module?
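If useful, the uninitialized taint can be checked directly on the nodes before and after re-applying; it should be removed automatically once the cloud controller manager has initialized each node. A minimal sketch, assuming kubectl access:

# list every node with its taint keys
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints[*].key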

I seem to be getting an extra failure after upgrading to 2.0.2.

NAMESPACE     NAME                                                    READY   STATUS             RESTARTS      AGE
kube-system   cilium-b4b6l                                            0/1     Running            0             4m1s
kube-system   cilium-jkhhv                                            0/1     Running            0             4m1s
kube-system   cilium-operator-5499547db9-5hdl5                        0/1     Pending            0             4m1s
kube-system   cilium-zrzfn                                            0/1     Running            1 (2s ago)    4m1s
kube-system   etcd-k8s-server-a-1                                     1/1     Running            0             4m
kube-system   helm-install-openstack-cinder-csi-5pmz9                 0/1     Completed          0             4m19s
kube-system   helm-install-openstack-cloud-controller-manager-xjzp4   0/1     Completed          0             4m19s
kube-system   helm-install-rke2-cilium-s4522                          0/1     Completed          0             4m19s
kube-system   helm-install-rke2-coredns-2s9fh                         0/1     Completed          0             4m19s
kube-system   helm-install-rke2-metrics-server-r8wt9                  0/1     Pending            0             4m19s
kube-system   helm-install-rke2-snapshot-controller-crd-xs8cs         0/1     Pending            0             4m19s
kube-system   helm-install-rke2-snapshot-controller-dpsw2             0/1     Pending            0             4m19s
kube-system   helm-install-rke2-snapshot-validation-webhook-5zn9z     0/1     Pending            0             4m19s
kube-system   helm-install-velero-pzj6k                               0/1     Pending            0             4m19s
kube-system   kube-apiserver-k8s-server-a-1                           1/1     Running            0             4m19s
kube-system   kube-controller-manager-k8s-server-a-1                  1/1     Running            0             4m23s
kube-system   kube-scheduler-k8s-server-a-1                           1/1     Running            0             4m24s
kube-system   openstack-cinder-csi-controllerplugin-cf5f9869d-v269m   0/6     Pending            0             4m
kube-system   openstack-cloud-controller-manager-wqxd5                0/1     CrashLoopBackOff   3 (24s ago)   3m21s
kube-system   rke2-coredns-rke2-coredns-5697f484ff-xnvpc              0/1     Pending            0             4m3s
kube-system   rke2-coredns-rke2-coredns-autoscaler-597fb897d7-n24gf   0/1     Pending            0             4m3s

Here's the describe output for the openstack-cloud-controller-manager pod that's now crashing:

Name:             openstack-cloud-controller-manager-wqxd5
Namespace:        kube-system
Priority:         0
Service Account:  openstack-cloud-controller-manager
Node:             k8s-server-a-1/192.168.42.123
Start Time:       Sat, 13 May 2023 09:31:43 -0700
Labels:           app=openstack-cloud-controller-manager
                  chart=openstack-cloud-controller-manager-1.4.0
                  component=controllermanager
                  controller-revision-hash=7b4ddc57d8
                  heritage=Helm
                  pod-template-generation=1
                  release=openstack-cloud-controller-manager
Annotations:      checksum/config: 2f2ff41ec3a3b7caf549c9fafa1adf7bb51431ad64573384fccbafb429f11fa6
Status:           Running
IP:               192.168.42.123
IPs:
  IP:           192.168.42.123
Controlled By:  DaemonSet/openstack-cloud-controller-manager
Containers:
  openstack-cloud-controller-manager:
    Container ID:  containerd://9b934faddc70df0f5ffe4a0415f84dcc6f29bd50764c64bad447db25c7c9deed
    Image:         docker.io/k8scloudprovider/openstack-cloud-controller-manager:v1.25.0
    Image ID:      docker.io/k8scloudprovider/openstack-cloud-controller-manager@sha256:fffee05aab285856eb0b305adca8befeebf93b37035be219d5adfe75ef2c82d9
    Port:          <none>
    Host Port:     <none>
    Args:
      /bin/openstack-cloud-controller-manager
      --v=2
      --cloud-config=$(CLOUD_CONFIG)
      --cluster-name=$(CLUSTER_NAME)
      --cloud-provider=openstack
      --use-service-account-credentials=true
      --controllers=cloud-node,cloud-node-lifecycle,route,service
      --bind-address=127.0.0.1
      --use-service-account-credentials=false
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sat, 13 May 2023 09:35:23 -0700
      Finished:     Sat, 13 May 2023 09:35:54 -0700
    Ready:          False
    Restart Count:  4
    Requests:
      cpu:     50m
      memory:  64Mi
    Environment:
      CLOUD_CONFIG:  /etc/config/cloud.conf
      CLUSTER_NAME:  k8s
    Mounts:
      /etc/config from cloud-config-volume (ro)
      /etc/kubernetes/pki from k8s-certs (ro)
      /usr/libexec/kubernetes/kubelet-plugins/volume/exec from flexvolume-dir (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-x5gzv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  cloud-config-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cloud-config
    Optional:    false
  flexvolume-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/libexec/kubernetes/kubelet-plugins/volume/exec
    HostPathType:
  k8s-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/pki
    HostPathType:
  kube-api-access-x5gzv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              node-role.kubernetes.io/master=true
Tolerations:                 CriticalAddonsOnly:NoExecute op=Exists
                             node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  4m49s                default-scheduler  Successfully assigned kube-system/openstack-cloud-controller-manager-wqxd5 to k8s-server-a-1
  Normal   Pulling    4m49s                kubelet            Pulling image "docker.io/k8scloudprovider/openstack-cloud-controller-manager:v1.25.0"
  Normal   Pulled     4m45s                kubelet            Successfully pulled image "docker.io/k8scloudprovider/openstack-cloud-controller-manager:v1.25.0" in 4.326790668s (4.326835559s including waiting)
  Normal   Created    70s (x5 over 4m45s)  kubelet            Created container openstack-cloud-controller-manager
  Normal   Started    70s (x5 over 4m45s)  kubelet            Started container openstack-cloud-controller-manager
  Normal   Pulled     70s (x4 over 4m12s)  kubelet            Container image "docker.io/k8scloudprovider/openstack-cloud-controller-manager:v1.25.0" already present on machine
  Warning  BackOff    15s (x9 over 3m39s)  kubelet            Back-off restarting failed container openstack-cloud-controller-manager in pod openstack-cloud-controller-manager-wqxd5_kube-system(196e50bb-575a-429e-a2fe-47cf23e3b4a2)

And here's the describe output for the cilium-operator pod:

Name:                 cilium-operator-5499547db9-5hdl5
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Service Account:      cilium-operator
Node:                 <none>
Labels:               app.kubernetes.io/name=cilium-operator
                      app.kubernetes.io/part-of=cilium
                      io.cilium/app=operator
                      name=cilium-operator
                      pod-template-hash=5499547db9
Annotations:          prometheus.io/port: 9963
                      prometheus.io/scrape: true
Status:               Pending
IP:
IPs:                  <none>
Controlled By:        ReplicaSet/cilium-operator-5499547db9
Containers:
  cilium-operator:
    Image:      rancher/mirrored-cilium-operator-aws:v1.13.0
    Port:       9963/TCP
    Host Port:  9963/TCP
    Command:
      cilium-operator-aws
    Args:
      --config-dir=/tmp/cilium/config-map
      --debug=$(CILIUM_DEBUG)
    Liveness:  http-get http://127.0.0.1:9234/healthz delay=60s timeout=3s period=10s #success=1 #failure=3
    Environment:
      K8S_NODE_NAME:             (v1:spec.nodeName)
      CILIUM_K8S_NAMESPACE:     kube-system (v1:metadata.namespace)
      CILIUM_DEBUG:             <set to the key 'debug' of config map 'cilium-config'>           Optional: true
      AWS_ACCESS_KEY_ID:        <set to the key 'AWS_ACCESS_KEY_ID' in secret 'cilium-aws'>      Optional: true
      AWS_SECRET_ACCESS_KEY:    <set to the key 'AWS_SECRET_ACCESS_KEY' in secret 'cilium-aws'>  Optional: true
      AWS_DEFAULT_REGION:       <set to the key 'AWS_DEFAULT_REGION' in secret 'cilium-aws'>     Optional: true
      KUBERNETES_SERVICE_HOST:  192.168.44.3
      KUBERNETES_SERVICE_PORT:  6443
    Mounts:
      /tmp/cilium/config-map from cilium-config-path (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-hpkmj (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  cilium-config-path:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      cilium-config
    Optional:  false
  kube-api-access-hpkmj:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              kubernetes.io/os=linux
                             node-role.kubernetes.io/master=true
Tolerations:                 CriticalAddonsOnly:NoExecute op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  9m33s  default-scheduler  0/3 nodes are available: 3 node(s) had untolerated taint {node.cloudprovider.kubernetes.io/uninitialized: true}. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..
  Warning  FailedScheduling  4m3s   default-scheduler  0/3 nodes are available: 3 node(s) had untolerated taint {node.cloudprovider.kubernetes.io/uninitialized: true}. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..

And this is the Terraform configuration (JSON) I ran:

{
  "//": {
    "metadata": {
      "backend": "local",
      "stackName": "cdkos",
      "version": "0.16.1"
    },
    "outputs": {
    }
  },
  "module": {
    "k8s": {
      "//": {
        "metadata": {
          "path": "cdkos/k8s",
          "uniqueId": "k8s"
        }
      },
      "agents": [
        {
          "boot_volume_size": 8,
          "flavor_name": "jeg1.large",
          "image_name": "ubuntu-jammy",
          "name": "pool-a",
          "nodes_count": 2,
          "rke2_version": "v1.26.4+rke2r1",
          "rke2_volume_size": 64,
          "system_user": "ubuntu"
        }
      ],
      "bootstrap": true,
      "ff_write_kubeconfig": true,
      "floating_pool": "public",
      "identity_endpoint": "http://192.168.1.150/identity",
      "name": "k8s",
      "object_store_endpoint": "http://192.168.1.150:8080/v1/AUTH_6c8ea23ef5374e4aace1ff0da81ce1b2",
      "providers": {
        "openstack": "openstack"
      },
      "rules_k8s_cidr": "0.0.0.0/0",
      "rules_ssh_cidr": "0.0.0.0/0",
      "servers": [
        {
          "boot_volume_size": 8,
          "flavor_name": "jeg1.large",
          "image_name": "ubuntu-jammy",
          "name": "server-a",
          "rke2_version": "v1.26.4+rke2r1",
          "rke2_volume_size": 64,
          "system_user": "ubuntu"
        }
      ],
      "source": "zifeo/rke2/openstack",
      "version": "~> 2.0.2"
    }
  },
  "provider": {
    "openstack": [
      {
        "auth_url": "http://192.168.1.150/identity",
        "enable_logging": true,
        "insecure": true,
        "password": "supersecretpasswordstuffhere",
        "project_domain_id": "31698c34f2f34b4f87b08837e2b71dc5",
        "region": "RegionOne",
        "tenant_id": "31698c34f2f34b4f87b08837e2b71dc5",
        "user_domain_name": "Default",
        "user_name": "admin"
      }
    ]
  },
  "terraform": {
    "backend": {
      "local": {
        "path": "/home/hh/cdkos/terraform.cdkos.tfstate"
      }
    },
    "required_providers": {
      "openstack": {
        "source": "terraform-provider-openstack/openstack",
        "version": "1.49.0"
      }
    }
  }
}

@hippiehunter Can you send the logs of openstack-cloud-controller-manager-wqxd5 along with its describe output?

kubectl describe pod -n kube-system openstack-cloud-controller-manager-wqxd5
Name:             openstack-cloud-controller-manager-wqxd5
Namespace:        kube-system
Priority:         0
Service Account:  openstack-cloud-controller-manager
Node:             k8s-server-a-1/192.168.42.123
Start Time:       Sat, 13 May 2023 09:31:43 -0700
Labels:           app=openstack-cloud-controller-manager
                  chart=openstack-cloud-controller-manager-1.4.0
                  component=controllermanager
                  controller-revision-hash=7b4ddc57d8
                  heritage=Helm
                  pod-template-generation=1
                  release=openstack-cloud-controller-manager
Annotations:      checksum/config: 2f2ff41ec3a3b7caf549c9fafa1adf7bb51431ad64573384fccbafb429f11fa6
Status:           Running
IP:               192.168.42.123
IPs:
  IP:           192.168.42.123
Controlled By:  DaemonSet/openstack-cloud-controller-manager
Containers:
  openstack-cloud-controller-manager:
    Container ID:  containerd://e0e12818c527491ca0a64673b792020663cad59d0af0145c724b6e324476e456
    Image:         docker.io/k8scloudprovider/openstack-cloud-controller-manager:v1.25.0
    Image ID:      docker.io/k8scloudprovider/openstack-cloud-controller-manager@sha256:fffee05aab285856eb0b305adca8befeebf93b37035be219d5adfe75ef2c82d9
    Port:          <none>
    Host Port:     <none>
    Args:
      /bin/openstack-cloud-controller-manager
      --v=2
      --cloud-config=$(CLOUD_CONFIG)
      --cluster-name=$(CLUSTER_NAME)
      --cloud-provider=openstack
      --use-service-account-credentials=true
      --controllers=cloud-node,cloud-node-lifecycle,route,service
      --bind-address=127.0.0.1
      --use-service-account-credentials=false
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sat, 13 May 2023 10:25:34 -0700
      Finished:     Sat, 13 May 2023 10:26:05 -0700
    Ready:          False
    Restart Count:  14
    Requests:
      cpu:     50m
      memory:  64Mi
    Environment:
      CLOUD_CONFIG:  /etc/config/cloud.conf
      CLUSTER_NAME:  k8s
    Mounts:
      /etc/config from cloud-config-volume (ro)
      /etc/kubernetes/pki from k8s-certs (ro)
      /usr/libexec/kubernetes/kubelet-plugins/volume/exec from flexvolume-dir (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-x5gzv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  cloud-config-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cloud-config
    Optional:    false
  flexvolume-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /usr/libexec/kubernetes/kubelet-plugins/volume/exec
    HostPathType:
  k8s-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/pki
    HostPathType:
  kube-api-access-x5gzv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              node-role.kubernetes.io/master=true
Tolerations:                 CriticalAddonsOnly:NoExecute op=Exists
                             node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  58m                    default-scheduler  Successfully assigned kube-system/openstack-cloud-controller-manager-wqxd5 to k8s-server-a-1
  Normal   Pulling    58m                    kubelet            Pulling image "docker.io/k8scloudprovider/openstack-cloud-controller-manager:v1.25.0"
  Normal   Pulled     58m                    kubelet            Successfully pulled image "docker.io/k8scloudprovider/openstack-cloud-controller-manager:v1.25.0" in 4.326790668s (4.326835559s including waiting)
  Normal   Created    54m (x5 over 58m)      kubelet            Created container openstack-cloud-controller-manager
  Normal   Started    54m (x5 over 58m)      kubelet            Started container openstack-cloud-controller-manager
  Normal   Pulled     38m (x8 over 57m)      kubelet            Container image "docker.io/k8scloudprovider/openstack-cloud-controller-manager:v1.25.0" already present on machine
  Warning  BackOff    3m20s (x215 over 57m)  kubelet            Back-off restarting failed container openstack-cloud-controller-manager in pod openstack-cloud-controller-manager-wqxd5_kube-system(196e50bb-575a-429e-a2fe-47cf23e3b4a2)
kubectl logs -n kube-system openstack-cloud-controller-manager-wqxd5
I0513 17:25:34.493818       1 flags.go:64] FLAG: --add_dir_header="false"
I0513 17:25:34.493951       1 flags.go:64] FLAG: --allocate-node-cidrs="false"
I0513 17:25:34.493967       1 flags.go:64] FLAG: --allow-untagged-cloud="false"
I0513 17:25:34.493977       1 flags.go:64] FLAG: --alsologtostderr="false"
I0513 17:25:34.493994       1 flags.go:64] FLAG: --authentication-kubeconfig=""
I0513 17:25:34.494008       1 flags.go:64] FLAG: --authentication-skip-lookup="false"
I0513 17:25:34.494018       1 flags.go:64] FLAG: --authentication-token-webhook-cache-ttl="10s"
I0513 17:25:34.494029       1 flags.go:64] FLAG: --authentication-tolerate-lookup-failure="false"
I0513 17:25:34.494039       1 flags.go:64] FLAG: --authorization-always-allow-paths="[/healthz,/readyz,/livez]"
I0513 17:25:34.494070       1 flags.go:64] FLAG: --authorization-kubeconfig=""
I0513 17:25:34.494082       1 flags.go:64] FLAG: --authorization-webhook-cache-authorized-ttl="10s"
I0513 17:25:34.494091       1 flags.go:64] FLAG: --authorization-webhook-cache-unauthorized-ttl="10s"
I0513 17:25:34.494099       1 flags.go:64] FLAG: --bind-address="127.0.0.1"
I0513 17:25:34.494108       1 flags.go:64] FLAG: --cert-dir=""
I0513 17:25:34.494115       1 flags.go:64] FLAG: --cidr-allocator-type="RangeAllocator"
I0513 17:25:34.494123       1 flags.go:64] FLAG: --client-ca-file=""
I0513 17:25:34.494130       1 flags.go:64] FLAG: --cloud-config="/etc/config/cloud.conf"
I0513 17:25:34.494138       1 flags.go:64] FLAG: --cloud-provider="openstack"
I0513 17:25:34.494145       1 flags.go:64] FLAG: --cluster-cidr=""
I0513 17:25:34.494153       1 flags.go:64] FLAG: --cluster-name="k8s"
I0513 17:25:34.494160       1 flags.go:64] FLAG: --concurrent-service-syncs="1"
I0513 17:25:34.494170       1 flags.go:64] FLAG: --configure-cloud-routes="true"
I0513 17:25:34.494177       1 flags.go:64] FLAG: --contention-profiling="false"
I0513 17:25:34.494184       1 flags.go:64] FLAG: --controller-start-interval="0s"
I0513 17:25:34.494191       1 flags.go:64] FLAG: --controllers="[cloud-node,cloud-node-lifecycle,route,service]"
I0513 17:25:34.494217       1 flags.go:64] FLAG: --enable-leader-migration="false"
I0513 17:25:34.494225       1 flags.go:64] FLAG: --external-cloud-volume-plugin=""
I0513 17:25:34.494233       1 flags.go:64] FLAG: --feature-gates=""
I0513 17:25:34.494243       1 flags.go:64] FLAG: --help="false"
I0513 17:25:34.494250       1 flags.go:64] FLAG: --http2-max-streams-per-connection="0"
I0513 17:25:34.494260       1 flags.go:64] FLAG: --kube-api-burst="30"
I0513 17:25:34.494268       1 flags.go:64] FLAG: --kube-api-content-type="application/vnd.kubernetes.protobuf"
I0513 17:25:34.494276       1 flags.go:64] FLAG: --kube-api-qps="20"
I0513 17:25:34.494291       1 flags.go:64] FLAG: --kubeconfig=""
I0513 17:25:34.494299       1 flags.go:64] FLAG: --leader-elect="true"
I0513 17:25:34.494306       1 flags.go:64] FLAG: --leader-elect-lease-duration="15s"
I0513 17:25:34.494313       1 flags.go:64] FLAG: --leader-elect-renew-deadline="10s"
I0513 17:25:34.494320       1 flags.go:64] FLAG: --leader-elect-resource-lock="leases"
I0513 17:25:34.494328       1 flags.go:64] FLAG: --leader-elect-resource-name="cloud-controller-manager"
I0513 17:25:34.494335       1 flags.go:64] FLAG: --leader-elect-resource-namespace="kube-system"
I0513 17:25:34.494343       1 flags.go:64] FLAG: --leader-elect-retry-period="2s"
I0513 17:25:34.494350       1 flags.go:64] FLAG: --leader-migration-config=""
I0513 17:25:34.494357       1 flags.go:64] FLAG: --log-flush-frequency="5s"
I0513 17:25:34.494364       1 flags.go:64] FLAG: --log_backtrace_at=":0"
I0513 17:25:34.494374       1 flags.go:64] FLAG: --log_dir=""
I0513 17:25:34.494382       1 flags.go:64] FLAG: --log_file=""
I0513 17:25:34.494389       1 flags.go:64] FLAG: --log_file_max_size="1800"
I0513 17:25:34.494397       1 flags.go:64] FLAG: --logtostderr="true"
I0513 17:25:34.494405       1 flags.go:64] FLAG: --master=""
I0513 17:25:34.494412       1 flags.go:64] FLAG: --min-resync-period="12h0m0s"
I0513 17:25:34.494420       1 flags.go:64] FLAG: --node-monitor-period="5s"
I0513 17:25:34.494427       1 flags.go:64] FLAG: --node-status-update-frequency="5m0s"
I0513 17:25:34.494434       1 flags.go:64] FLAG: --node-sync-period="0s"
I0513 17:25:34.494441       1 flags.go:64] FLAG: --one_output="false"
I0513 17:25:34.494449       1 flags.go:64] FLAG: --permit-address-sharing="false"
I0513 17:25:34.494456       1 flags.go:64] FLAG: --permit-port-sharing="false"
I0513 17:25:34.494463       1 flags.go:64] FLAG: --profiling="true"
I0513 17:25:34.494470       1 flags.go:64] FLAG: --requestheader-allowed-names="[]"
I0513 17:25:34.494492       1 flags.go:64] FLAG: --requestheader-client-ca-file=""
I0513 17:25:34.494500       1 flags.go:64] FLAG: --requestheader-extra-headers-prefix="[x-remote-extra-]"
I0513 17:25:34.494513       1 flags.go:64] FLAG: --requestheader-group-headers="[x-remote-group]"
I0513 17:25:34.494533       1 flags.go:64] FLAG: --requestheader-username-headers="[x-remote-user]"
I0513 17:25:34.494544       1 flags.go:64] FLAG: --route-reconciliation-period="10s"
I0513 17:25:34.494552       1 flags.go:64] FLAG: --secure-port="10258"
I0513 17:25:34.494559       1 flags.go:64] FLAG: --skip_headers="false"
I0513 17:25:34.494567       1 flags.go:64] FLAG: --skip_log_headers="false"
I0513 17:25:34.494574       1 flags.go:64] FLAG: --stderrthreshold="2"
I0513 17:25:34.494582       1 flags.go:64] FLAG: --tls-cert-file=""
I0513 17:25:34.494589       1 flags.go:64] FLAG: --tls-cipher-suites="[]"
I0513 17:25:34.494598       1 flags.go:64] FLAG: --tls-min-version=""
I0513 17:25:34.494605       1 flags.go:64] FLAG: --tls-private-key-file=""
I0513 17:25:34.494612       1 flags.go:64] FLAG: --tls-sni-cert-key="[]"
I0513 17:25:34.494627       1 flags.go:64] FLAG: --use-service-account-credentials="false"
I0513 17:25:34.494635       1 flags.go:64] FLAG: --user-agent="[]"
I0513 17:25:34.494654       1 flags.go:64] FLAG: --v="2"
I0513 17:25:34.494661       1 flags.go:64] FLAG: --version="false"
I0513 17:25:34.494671       1 flags.go:64] FLAG: --vmodule=""
I0513 17:25:35.310412       1 serving.go:348] Generated self-signed cert in-memory
unable to load configmap based request-header-client-ca-file: Get "https://10.43.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication": dial tcp 10.43.0.1:443: i/o timeout
Error: unable to load configmap based request-header-client-ca-file: Get "https://10.43.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication": dial tcp 10.43.0.1:443: i/o timeout
Usage:
  cloud-controller-manager [flags]

Debugging flags:

      --contention-profiling   Enable lock contention profiling, if profiling is enabled
      --profiling              Enable profiling via web interface host:port/debug/pprof/ (default true)

Leader-migration flags:

      --enable-leader-migration          Whether to enable controller leader migration.
      --leader-migration-config string   Path to the config file for controller leader migration, or empty to use the value that reflects default configuration of the controller manager. The config file should be of type LeaderMigrationConfiguration, group controllermanager.config.k8s.io, version v1alpha1.

Generic flags:

      --allocate-node-cidrs                      Should CIDRs for Pods be allocated and set on the cloud provider.
      --cidr-allocator-type string               Type of CIDR allocator to use (default "RangeAllocator")
      --cloud-config string                      The path to the cloud provider configuration file. Empty string for no configuration file.
      --cloud-provider string                    The provider for cloud services. Empty string for no provider.
      --cluster-cidr string                      CIDR Range for Pods in cluster. Requires --allocate-node-cidrs to be true
      --cluster-name string                      The instance prefix for the cluster. (default "kubernetes")
      --configure-cloud-routes                   Should CIDRs allocated by allocate-node-cidrs be configured on the cloud provider. (default true)
      --controller-start-interval duration       Interval between starting controller managers.
      --controllers strings                      A list of controllers to enable. '*' enables all on-by-default controllers, 'foo' enables the controller named 'foo', '-foo' disables the controller named 'foo'.
                                                 All controllers: cloud-node, cloud-node-lifecycle, route, service
                                                 Disabled-by-default controllers:  (default [*])
      --external-cloud-volume-plugin string      The plugin to use when cloud provider is set to external. Can be empty, should only be set when cloud-provider is external. Currently used to allow node and volume controllers to work for in tree cloud providers.
      --feature-gates mapStringBool              A set of key=value pairs that describe feature gates for alpha/experimental features. Options are:
                                                 APIListChunking=true|false (BETA - default=true)
                                                 APIPriorityAndFairness=true|false (BETA - default=true)
                                                 APIResponseCompression=true|false (BETA - default=true)
                                                 APIServerIdentity=true|false (ALPHA - default=false)
                                                 APIServerTracing=true|false (ALPHA - default=false)
                                                 AllAlpha=true|false (ALPHA - default=false)
                                                 AllBeta=true|false (BETA - default=false)
                                                 AnyVolumeDataSource=true|false (BETA - default=true)
                                                 AppArmor=true|false (BETA - default=true)
                                                 CPUManager=true|false (BETA - default=true)
                                                 CPUManagerPolicyAlphaOptions=true|false (ALPHA - default=false)
                                                 CPUManagerPolicyBetaOptions=true|false (BETA - default=true)
                                                 CPUManagerPolicyOptions=true|false (BETA - default=true)
                                                 CSIMigrationAzureFile=true|false (BETA - default=true)
                                                 CSIMigrationPortworx=true|false (BETA - default=false)
                                                 CSIMigrationRBD=true|false (ALPHA - default=false)
                                                 CSIMigrationvSphere=true|false (BETA - default=true)
                                                 CSINodeExpandSecret=true|false (ALPHA - default=false)
                                                 CSIVolumeHealth=true|false (ALPHA - default=false)
                                                 ContainerCheckpoint=true|false (ALPHA - default=false)
                                                 CronJobTimeZone=true|false (BETA - default=true)
                                                 CustomCPUCFSQuotaPeriod=true|false (ALPHA - default=false)
                                                 CustomResourceValidationExpressions=true|false (BETA - default=true)
                                                 DelegateFSGroupToCSIDriver=true|false (BETA - default=true)
                                                 DevicePlugins=true|false (BETA - default=true)
                                                 DisableCloudProviders=true|false (ALPHA - default=false)
                                                 DisableKubeletCloudCredentialProviders=true|false (ALPHA - default=false)
                                                 DownwardAPIHugePages=true|false (BETA - default=true)
                                                 EndpointSliceTerminatingCondition=true|false (BETA - default=true)
                                                 ExpandedDNSConfig=true|false (ALPHA - default=false)
                                                 ExperimentalHostUserNamespaceDefaulting=true|false (BETA - default=false)
                                                 GRPCContainerProbe=true|false (BETA - default=true)
                                                 GracefulNodeShutdown=true|false (BETA - default=true)
                                                 GracefulNodeShutdownBasedOnPodPriority=true|false (BETA - default=true)
                                                 HPAContainerMetrics=true|false (ALPHA - default=false)
                                                 HPAScaleToZero=true|false (ALPHA - default=false)
                                                 HonorPVReclaimPolicy=true|false (ALPHA - default=false)
                                                 IPTablesOwnershipCleanup=true|false (ALPHA - default=false)
                                                 InTreePluginAWSUnregister=true|false (ALPHA - default=false)
                                                 InTreePluginAzureDiskUnregister=true|false (ALPHA - default=false)
                                                 InTreePluginAzureFileUnregister=true|false (ALPHA - default=false)
                                                 InTreePluginGCEUnregister=true|false (ALPHA - default=false)
                                                 InTreePluginOpenStackUnregister=true|false (ALPHA - default=false)
                                                 InTreePluginPortworxUnregister=true|false (ALPHA - default=false)
                                                 InTreePluginRBDUnregister=true|false (ALPHA - default=false)
                                                 InTreePluginvSphereUnregister=true|false (ALPHA - default=false)
                                                 JobMutableNodeSchedulingDirectives=true|false (BETA - default=true)
                                                 JobPodFailurePolicy=true|false (ALPHA - default=false)
                                                 JobReadyPods=true|false (BETA - default=true)
                                                 JobTrackingWithFinalizers=true|false (BETA - default=true)
                                                 KMSv2=true|false (ALPHA - default=false)
                                                 KubeletCredentialProviders=true|false (BETA - default=true)
                                                 KubeletInUserNamespace=true|false (ALPHA - default=false)
                                                 KubeletPodResources=true|false (BETA - default=true)
                                                 KubeletPodResourcesGetAllocatable=true|false (BETA - default=true)
                                                 KubeletTracing=true|false (ALPHA - default=false)
                                                 LegacyServiceAccountTokenNoAutoGeneration=true|false (BETA - default=true)
                                                 LocalStorageCapacityIsolationFSQuotaMonitoring=true|false (BETA - default=true)
                                                 LogarithmicScaleDown=true|false (BETA - default=true)
                                                 MatchLabelKeysInPodTopologySpread=true|false (ALPHA - default=false)
                                                 MaxUnavailableStatefulSet=true|false (ALPHA - default=false)
                                                 MemoryManager=true|false (BETA - default=true)
                                                 MemoryQoS=true|false (ALPHA - default=false)
                                                 MinDomainsInPodTopologySpread=true|false (BETA - default=false)
                                                 MixedProtocolLBService=true|false (BETA - default=true)
                                                 MultiCIDRRangeAllocator=true|false (ALPHA - default=false)
                                                 NetworkPolicyStatus=true|false (ALPHA - default=false)
                                                 NodeInclusionPolicyInPodTopologySpread=true|false (ALPHA - default=false)
                                                 NodeOutOfServiceVolumeDetach=true|false (ALPHA - default=false)
                                                 NodeSwap=true|false (ALPHA - default=false)
                                                 OpenAPIEnums=true|false (BETA - default=true)
                                                 OpenAPIV3=true|false (BETA - default=true)
                                                 PodAndContainerStatsFromCRI=true|false (ALPHA - default=false)
                                                 PodDeletionCost=true|false (BETA - default=true)
                                                 PodDisruptionConditions=true|false (ALPHA - default=false)
                                                 PodHasNetworkCondition=true|false (ALPHA - default=false)
                                                 ProbeTerminationGracePeriod=true|false (BETA - default=true)
                                                 ProcMountType=true|false (ALPHA - default=false)
                                                 ProxyTerminatingEndpoints=true|false (ALPHA - default=false)
                                                 QOSReserved=true|false (ALPHA - default=false)
                                                 ReadWriteOncePod=true|false (ALPHA - default=false)
                                                 RecoverVolumeExpansionFailure=true|false (ALPHA - default=false)
                                                 RemainingItemCount=true|false (BETA - default=true)
                                                 RetroactiveDefaultStorageClass=true|false (ALPHA - default=false)
                                                 RotateKubeletServerCertificate=true|false (BETA - default=true)
                                                 SELinuxMountReadWriteOncePod=true|false (ALPHA - default=false)
                                                 SeccompDefault=true|false (BETA - default=true)
                                                 ServerSideFieldValidation=true|false (BETA - default=true)
                                                 ServiceIPStaticSubrange=true|false (BETA - default=true)
                                                 ServiceInternalTrafficPolicy=true|false (BETA - default=true)
                                                 SizeMemoryBackedVolumes=true|false (BETA - default=true)
                                                 StatefulSetAutoDeletePVC=true|false (ALPHA - default=false)
                                                 StorageVersionAPI=true|false (ALPHA - default=false)
                                                 StorageVersionHash=true|false (BETA - default=true)
                                                 TopologyAwareHints=true|false (BETA - default=true)
                                                 TopologyManager=true|false (BETA - default=true)
                                                 UserNamespacesStatelessPodsSupport=true|false (ALPHA - default=false)
                                                 VolumeCapacityPriority=true|false (ALPHA - default=false)
                                                 WinDSR=true|false (ALPHA - default=false)
                                                 WinOverlay=true|false (BETA - default=true)
                                                 WindowsHostProcessContainers=true|false (BETA - default=true)
      --kube-api-burst int32                     Burst to use while talking with kubernetes apiserver. (default 30)
      --kube-api-content-type string             Content type of requests sent to apiserver. (default "application/vnd.kubernetes.protobuf")
      --kube-api-qps float32                     QPS to use while talking with kubernetes apiserver. (default 20)
      --leader-elect                             Start a leader election client and gain leadership before executing the main loop. Enable this when running replicated components for high availability. (default true)
      --leader-elect-lease-duration duration     The duration that non-leader candidates will wait after observing a leadership renewal until attempting to acquire leadership of a led but unrenewed leader slot. This is effectively the maximum duration that a leader can be stopped before it is replaced by another candidate. This is only applicable if leader election is enabled. (default 15s)
      --leader-elect-renew-deadline duration     The interval between attempts by the acting master to renew a leadership slot before it stops leading. This must be less than or equal to the lease duration. This is only applicable if leader election is enabled. (default 10s)
      --leader-elect-resource-lock string        The type of resource object that is used for locking during leader election. Supported options are 'leases', 'endpointsleases' and 'configmapsleases'. (default "leases")
      --leader-elect-resource-name string        The name of resource object that is used for locking during leader election. (default "cloud-controller-manager")
      --leader-elect-resource-namespace string   The namespace of resource object that is used for locking during leader election. (default "kube-system")
      --leader-elect-retry-period duration       The duration the clients should wait between attempting acquisition and renewal of a leadership. This is only applicable if leader election is enabled. (default 2s)
      --min-resync-period duration               The resync period in reflectors will be random between MinResyncPeriod and 2*MinResyncPeriod. (default 12h0m0s)
      --node-monitor-period duration             The period for syncing NodeStatus in NodeController. (default 5s)
      --route-reconciliation-period duration     The period for reconciling routes created for Nodes by cloud provider. (default 10s)
      --use-service-account-credentials          If true, use individual service account credentials for each controller.

Service controller flags:

      --concurrent-service-syncs int32   The number of services that are allowed to sync concurrently. Larger number = more responsive service management, but more CPU (and network) load (default 1)

Secure serving flags:

      --bind-address ip                        The IP address on which to listen for the --secure-port port. The associated interface(s) must be reachable by the rest of the cluster, and by CLI/web clients. If blank or an unspecified address (0.0.0.0 or ::), all interfaces will be used. (default 0.0.0.0)
      --cert-dir string                        The directory where the TLS certs are located. If --tls-cert-file and --tls-private-key-file are provided, this flag will be ignored.
      --http2-max-streams-per-connection int   The limit that the server gives to clients for the maximum number of streams in an HTTP/2 connection. Zero means to use golang's default.
      --permit-address-sharing                 If true, SO_REUSEADDR will be used when binding the port. This allows binding to wildcard IPs like 0.0.0.0 and specific IPs in parallel, and it avoids waiting for the kernel to release sockets in TIME_WAIT state. [default=false]
      --permit-port-sharing                    If true, SO_REUSEPORT will be used when binding the port, which allows more than one instance to bind on the same address and port. [default=false]
      --secure-port int                        The port on which to serve HTTPS with authentication and authorization. If 0, don't serve HTTPS at all. (default 10258)
      --tls-cert-file string                   File containing the default x509 Certificate for HTTPS. (CA cert, if any, concatenated after server cert). If HTTPS serving is enabled, and --tls-cert-file and --tls-private-key-file are not provided, a self-signed certificate and key are generated for the public address and saved to the directory specified by --cert-dir.
      --tls-cipher-suites strings              Comma-separated list of cipher suites for the server. If omitted, the default Go cipher suites will be used.
                                               Preferred values: TLS_AES_128_GCM_SHA256, TLS_AES_256_GCM_SHA384, TLS_CHACHA20_POLY1305_SHA256, TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305, TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305, TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_128_GCM_SHA256, TLS_RSA_WITH_AES_256_CBC_SHA, TLS_RSA_WITH_AES_256_GCM_SHA384.
                                               Insecure values: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_RC4_128_SHA, TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_RSA_WITH_RC4_128_SHA, TLS_RSA_WITH_3DES_EDE_CBC_SHA, TLS_RSA_WITH_AES_128_CBC_SHA256, TLS_RSA_WITH_RC4_128_SHA.
      --tls-min-version string                 Minimum TLS version supported. Possible values: VersionTLS10, VersionTLS11, VersionTLS12, VersionTLS13
      --tls-private-key-file string            File containing the default x509 private key matching --tls-cert-file.
      --tls-sni-cert-key namedCertKey          A pair of x509 certificate and private key file paths, optionally suffixed with a list of domain patterns which are fully qualified domain names, possibly with prefixed wildcard segments. The domain patterns also allow IP addresses, but IPs should only be used if the apiserver has visibility to the IP address requested by a client. If no domain patterns are provided, the names of the certificate are extracted. Non-wildcard matches trump over wildcard matches, explicit domain patterns trump over extracted names. For multiple key/certificate pairs, use the --tls-sni-cert-key multiple times. Examples: "example.crt,example.key" or "foo.crt,foo.key:*.foo.com,foo.com". (default [])

Authentication flags:

      --authentication-kubeconfig string                  kubeconfig file pointing at the 'core' kubernetes server with enough rights to create tokenreviews.authentication.k8s.io. This is optional. If empty, all token requests are considered to be anonymous and no client CA is looked up in the cluster.
      --authentication-skip-lookup                        If false, the authentication-kubeconfig will be used to lookup missing authentication configuration from the cluster.
      --authentication-token-webhook-cache-ttl duration   The duration to cache responses from the webhook token authenticator. (default 10s)
      --authentication-tolerate-lookup-failure            If true, failures to look up missing authentication configuration from the cluster are not considered fatal. Note that this can result in authentication that treats all requests as anonymous.
      --client-ca-file string                             If set, any request presenting a client certificate signed by one of the authorities in the client-ca-file is authenticated with an identity corresponding to the CommonName of the client certificate.
      --requestheader-allowed-names strings               List of client certificate common names to allow to provide usernames in headers specified by --requestheader-username-headers. If empty, any client certificate validated by the authorities in --requestheader-client-ca-file is allowed.
      --requestheader-client-ca-file string               Root certificate bundle to use to verify client certificates on incoming requests before trusting usernames in headers specified by --requestheader-username-headers. WARNING: generally do not depend on authorization being already done for incoming requests.
      --requestheader-extra-headers-prefix strings        List of request header prefixes to inspect. X-Remote-Extra- is suggested. (default [x-remote-extra-])
      --requestheader-group-headers strings               List of request headers to inspect for groups. X-Remote-Group is suggested. (default [x-remote-group])
      --requestheader-username-headers strings            List of request headers to inspect for usernames. X-Remote-User is common. (default [x-remote-user])

Authorization flags:

      --authorization-always-allow-paths strings                A list of HTTP paths to skip during authorization, i.e. these are authorized without contacting the 'core' kubernetes server. (default [/healthz,/readyz,/livez])
      --authorization-kubeconfig string                         kubeconfig file pointing at the 'core' kubernetes server with enough rights to create subjectaccessreviews.authorization.k8s.io. This is optional. If empty, all requests not skipped by authorization are forbidden.
      --authorization-webhook-cache-authorized-ttl duration     The duration to cache 'authorized' responses from the webhook authorizer. (default 10s)
      --authorization-webhook-cache-unauthorized-ttl duration   The duration to cache 'unauthorized' responses from the webhook authorizer. (default 10s)

Misc flags:

      --kubeconfig string                       Path to kubeconfig file with authorization and master location information.
      --master string                           The address of the Kubernetes API server (overrides any value in kubeconfig).
      --node-status-update-frequency duration   Specifies how often the controller updates nodes' status. (default 5m0s)

Global flags:

      --add_dir_header                   If true, adds the file directory to the header of the log messages (DEPRECATED: will be removed in a future release, see https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components)
      --alsologtostderr                  log to standard error as well as files (no effect when -logtostderr=true) (DEPRECATED: will be removed in a future release, see https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components)
  -h, --help                             help for cloud-controller-manager
      --log-flush-frequency duration     Maximum number of seconds between log flushes (default 5s)
      --log_backtrace_at traceLocation   when logging hits line file:N, emit a stack trace (default :0) (DEPRECATED: will be removed in a future release, see https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components)
      --log_dir string                   If non-empty, write log files in this directory (no effect when -logtostderr=true) (DEPRECATED: will be removed in a future release, see https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components)
      --log_file string                  If non-empty, use this log file (no effect when -logtostderr=true) (DEPRECATED: will be removed in a future release, see https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components)
      --log_file_max_size uint           Defines the maximum size a log file can grow to (no effect when -logtostderr=true). Unit is megabytes. If the value is 0, the maximum file size is unlimited. (default 1800) (DEPRECATED: will be removed in a future release, see https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components)
      --logtostderr                      log to standard error instead of files (default true) (DEPRECATED: will be removed in a future release, see https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components)
      --one_output                       If true, only write logs to their native severity level (vs also writing to each lower severity level; no effect when -logtostderr=true) (DEPRECATED: will be removed in a future release, see https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components)
      --skip_headers                     If true, avoid header prefixes in the log messages (DEPRECATED: will be removed in a future release, see https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components)
      --skip_log_headers                 If true, avoid headers when opening log files (no effect when -logtostderr=true) (DEPRECATED: will be removed in a future release, see https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components)
      --stderrthreshold severity         logs at or above this threshold go to stderr when writing to files and stderr (no effect when -logtostderr=true or -alsologtostderr=false) (default 2) (DEPRECATED: will be removed in a future release, see https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components)
  -v, --v Level                          number for the log level verbosity (default 0)
      --version version[=true]           Print version information and quit
      --vmodule moduleSpec               comma-separated list of pattern=N settings for file-filtered logging

error: unable to load configmap based request-header-client-ca-file: Get "https://10.43.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication": dial tcp 10.43.0.1:443: i/o timeout

Unfortunately, I have the exact same issue with version 2.0.2 of this module:

Get "https://10.43.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication": dial tcp 10.43.0.1:443: i/o timeout
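For reference, these are the kinds of checks I would run from the affected node to confirm the service IP is unreachable (a minimal sketch, assuming the default RKE2 service CIDR 10.43.0.0/16 and that curl is available on the node):

# should answer with an HTTP error (typically 401/403) rather than time out
curl -k --connect-timeout 5 https://10.43.0.1/healthz
# show how the node routes the service IP (it should go via the CNI-managed interface)
ip route get 10.43.0.1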

@imscaradh @hippiehunter I finally found the time to reproduce the issue. I mostly use an HA setup, where the issue did not exist. v2.0.3 should fix your single-node setup. Let me know if anything is still blocking.
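If you are consuming the module from the Terraform registry rather than a local checkout, upgrading is just a matter of bumping the pinned version and re-applying (a minimal sketch, assuming no other changes to your configuration):

# after setting version = "2.0.3" on the module block
terraform init -upgrade
terraform apply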

@zifeo thanks for investigating this. Unfortunately, I am facing another error during startup of the Cilium agents:

level=info msg="Memory available for map entries (0.003% of 8323051520B): 20807628B" subsys=config
level=info msg="option bpf-ct-global-tcp-max set by dynamic sizing to 131072" subsys=config
level=info msg="option bpf-ct-global-any-max set by dynamic sizing to 65536" subsys=config
level=info msg="option bpf-nat-global-max set by dynamic sizing to 131072" subsys=config
level=info msg="option bpf-neigh-global-max set by dynamic sizing to 131072" subsys=config
level=info msg="option bpf-sock-rev-map-max set by dynamic sizing to 65536" subsys=config
level=info msg="  --agent-health-port='9879'" subsys=daemon
level=info msg="  --agent-labels=''" subsys=daemon
level=info msg="  --agent-not-ready-taint-key='node.cilium.io/agent-not-ready'" subsys=daemon
level=info msg="  --allocator-list-timeout='3m0s'" subsys=daemon
level=info msg="  --allow-icmp-frag-needed='true'" subsys=daemon
level=info msg="  --allow-localhost='auto'" subsys=daemon
level=info msg="  --annotate-k8s-node='false'" subsys=daemon
level=info msg="  --api-rate-limit=''" subsys=daemon
level=info msg="  --arping-refresh-period='30s'" subsys=daemon
level=info msg="  --auto-create-cilium-node-resource='true'" subsys=daemon
level=info msg="  --auto-direct-node-routes='false'" subsys=daemon
level=info msg="  --bgp-announce-lb-ip='false'" subsys=daemon
level=info msg="  --bgp-announce-pod-cidr='false'" subsys=daemon
level=info msg="  --bgp-config-path='/var/lib/cilium/bgp/config.yaml'" subsys=daemon
level=info msg="  --bpf-ct-global-any-max='262144'" subsys=daemon
level=info msg="  --bpf-ct-global-tcp-max='524288'" subsys=daemon
level=info msg="  --bpf-ct-timeout-regular-any='1m0s'" subsys=daemon
level=info msg="  --bpf-ct-timeout-regular-tcp='6h0m0s'" subsys=daemon
level=info msg="  --bpf-ct-timeout-regular-tcp-fin='10s'" subsys=daemon
level=info msg="  --bpf-ct-timeout-regular-tcp-syn='1m0s'" subsys=daemon
level=info msg="  --bpf-ct-timeout-service-any='1m0s'" subsys=daemon
level=info msg="  --bpf-ct-timeout-service-tcp='6h0m0s'" subsys=daemon
level=info msg="  --bpf-ct-timeout-service-tcp-grace='1m0s'" subsys=daemon
level=info msg="  --bpf-filter-priority='1'" subsys=daemon
level=info msg="  --bpf-fragments-map-max='8192'" subsys=daemon
level=info msg="  --bpf-lb-acceleration='disabled'" subsys=daemon
level=info msg="  --bpf-lb-affinity-map-max='0'" subsys=daemon
level=info msg="  --bpf-lb-algorithm='random'" subsys=daemon
level=info msg="  --bpf-lb-dev-ip-addr-inherit=''" subsys=daemon
level=info msg="  --bpf-lb-dsr-dispatch='opt'" subsys=daemon
level=info msg="  --bpf-lb-dsr-l4-xlate='frontend'" subsys=daemon
level=info msg="  --bpf-lb-external-clusterip='false'" subsys=daemon
level=info msg="  --bpf-lb-maglev-hash-seed='JLfvgnHc2kaSUFaI'" subsys=daemon
level=info msg="  --bpf-lb-maglev-map-max='0'" subsys=daemon
level=info msg="  --bpf-lb-maglev-table-size='16381'" subsys=daemon
level=info msg="  --bpf-lb-map-max='65536'" subsys=daemon
level=info msg="  --bpf-lb-mode='snat'" subsys=daemon
level=info msg="  --bpf-lb-rev-nat-map-max='0'" subsys=daemon
level=info msg="  --bpf-lb-rss-ipv4-src-cidr=''" subsys=daemon
level=info msg="  --bpf-lb-rss-ipv6-src-cidr=''" subsys=daemon
level=info msg="  --bpf-lb-service-backend-map-max='0'" subsys=daemon
level=info msg="  --bpf-lb-service-map-max='0'" subsys=daemon
level=info msg="  --bpf-lb-sock='false'" subsys=daemon
level=info msg="  --bpf-lb-sock-hostns-only='false'" subsys=daemon
level=info msg="  --bpf-lb-source-range-map-max='0'" subsys=daemon
level=info msg="  --bpf-map-dynamic-size-ratio='0.0025'" subsys=daemon
level=info msg="  --bpf-map-event-buffers=''" subsys=daemon
level=info msg="  --bpf-nat-global-max='524288'" subsys=daemon
level=info msg="  --bpf-neigh-global-max='524288'" subsys=daemon
level=info msg="  --bpf-policy-map-max='16384'" subsys=daemon
level=info msg="  --bpf-root='/sys/fs/bpf'" subsys=daemon
level=info msg="  --bpf-sock-rev-map-max='262144'" subsys=daemon
level=info msg="  --bypass-ip-availability-upon-restore='false'" subsys=daemon
level=info msg="  --certificates-directory='/var/run/cilium/certs'" subsys=daemon
level=info msg="  --cflags=''" subsys=daemon
level=info msg="  --cgroup-root='/run/cilium/cgroupv2'" subsys=daemon
level=info msg="  --cilium-endpoint-gc-interval='5m0s'" subsys=daemon
level=info msg="  --cluster-health-port='4240'" subsys=daemon
level=info msg="  --cluster-id='0'" subsys=daemon
level=info msg="  --cluster-name='ff-staging-openstack'" subsys=daemon
level=info msg="  --clustermesh-config='/var/lib/cilium/clustermesh/'" subsys=daemon
level=info msg="  --cmdref=''" subsys=daemon
level=info msg="  --cni-chaining-mode=''" subsys=daemon
level=info msg="  --cni-uninstall='true'" subsys=daemon
level=info msg="  --config=''" subsys=daemon
level=info msg="  --config-dir='/tmp/cilium/config-map'" subsys=daemon
level=info msg="  --config-sources='config-map:kube-system/cilium-config'" subsys=daemon
level=info msg="  --conntrack-gc-interval='0s'" subsys=daemon
level=info msg="  --crd-wait-timeout='5m0s'" subsys=daemon
level=info msg="  --custom-cni-conf='false'" subsys=daemon
level=info msg="  --datapath-mode='veth'" subsys=daemon
level=info msg="  --debug='false'" subsys=daemon
level=info msg="  --debug-verbose=''" subsys=daemon
level=info msg="  --derive-masquerade-ip-addr-from-device=''" subsys=daemon
level=info msg="  --devices=''" subsys=daemon
level=info msg="  --direct-routing-device=''" subsys=daemon
level=info msg="  --disable-cnp-status-updates='true'" subsys=daemon
level=info msg="  --disable-endpoint-crd='false'" subsys=daemon
level=info msg="  --disable-envoy-version-check='false'" subsys=daemon
level=info msg="  --disable-iptables-feeder-rules=''" subsys=daemon
level=info msg="  --dns-max-ips-per-restored-rule='1000'" subsys=daemon
level=info msg="  --dns-policy-unload-on-shutdown='false'" subsys=daemon
level=info msg="  --dnsproxy-concurrency-limit='0'" subsys=daemon
level=info msg="  --dnsproxy-concurrency-processing-grace-period='0s'" subsys=daemon
level=info msg="  --dnsproxy-lock-count='128'" subsys=daemon
level=info msg="  --dnsproxy-lock-timeout='500ms'" subsys=daemon
level=info msg="  --ec2-api-endpoint=''" subsys=daemon
level=info msg="  --egress-masquerade-interfaces=''" subsys=daemon
level=info msg="  --egress-multi-home-ip-rule-compat='false'" subsys=daemon
level=info msg="  --enable-auto-protect-node-port-range='true'" subsys=daemon
level=info msg="  --enable-bandwidth-manager='false'" subsys=daemon
level=info msg="  --enable-bbr='false'" subsys=daemon
level=info msg="  --enable-bgp-control-plane='false'" subsys=daemon
level=info msg="  --enable-bpf-clock-probe='true'" subsys=daemon
level=info msg="  --enable-bpf-masquerade='false'" subsys=daemon
level=info msg="  --enable-bpf-tproxy='false'" subsys=daemon
level=info msg="  --enable-cilium-endpoint-slice='false'" subsys=daemon
level=info msg="  --enable-custom-calls='false'" subsys=daemon
level=info msg="  --enable-endpoint-health-checking='true'" subsys=daemon
level=info msg="  --enable-endpoint-routes='true'" subsys=daemon
level=info msg="  --enable-envoy-config='false'" subsys=daemon
level=info msg="  --enable-external-ips='true'" subsys=daemon
level=info msg="  --enable-health-check-nodeport='true'" subsys=daemon
level=info msg="  --enable-health-checking='true'" subsys=daemon
level=info msg="  --enable-host-firewall='false'" subsys=daemon
level=info msg="  --enable-host-legacy-routing='false'" subsys=daemon
level=info msg="  --enable-host-port='true'" subsys=daemon
level=info msg="  --enable-hubble='false'" subsys=daemon
level=info msg="  --enable-hubble-recorder-api='true'" subsys=daemon
level=info msg="  --enable-icmp-rules='true'" subsys=daemon
level=info msg="  --enable-identity-mark='true'" subsys=daemon
level=info msg="  --enable-ip-masq-agent='false'" subsys=daemon
level=info msg="  --enable-ipsec='false'" subsys=daemon
level=info msg="  --enable-ipv4='true'" subsys=daemon
level=info msg="  --enable-ipv4-egress-gateway='false'" subsys=daemon
level=info msg="  --enable-ipv4-fragment-tracking='true'" subsys=daemon
level=info msg="  --enable-ipv4-masquerade='true'" subsys=daemon
level=info msg="  --enable-ipv6='false'" subsys=daemon
level=info msg="  --enable-ipv6-big-tcp='false'" subsys=daemon
level=info msg="  --enable-ipv6-masquerade='true'" subsys=daemon
level=info msg="  --enable-ipv6-ndp='false'" subsys=daemon
level=info msg="  --enable-k8s-api-discovery='false'" subsys=daemon
level=info msg="  --enable-k8s-endpoint-slice='true'" subsys=daemon
level=info msg="  --enable-k8s-event-handover='false'" subsys=daemon
level=info msg="  --enable-k8s-terminating-endpoint='true'" subsys=daemon
level=info msg="  --enable-l2-neigh-discovery='true'" subsys=daemon
level=info msg="  --enable-l7-proxy='true'" subsys=daemon
level=info msg="  --enable-local-node-route='true'" subsys=daemon
level=info msg="  --enable-local-redirect-policy='false'" subsys=daemon
level=info msg="  --enable-metrics='true'" subsys=daemon
level=info msg="  --enable-mke='false'" subsys=daemon
level=info msg="  --enable-monitor='true'" subsys=daemon
level=info msg="  --enable-nat46x64-gateway='false'" subsys=daemon
level=info msg="  --enable-node-port='false'" subsys=daemon
level=info msg="  --enable-pmtu-discovery='false'" subsys=daemon
level=info msg="  --enable-policy='default'" subsys=daemon
level=info msg="  --enable-recorder='false'" subsys=daemon
level=info msg="  --enable-remote-node-identity='true'" subsys=daemon
level=info msg="  --enable-runtime-device-detection='false'" subsys=daemon
level=info msg="  --enable-sctp='false'" subsys=daemon
level=info msg="  --enable-service-topology='false'" subsys=daemon
level=info msg="  --enable-session-affinity='false'" subsys=daemon
level=info msg="  --enable-srv6='false'" subsys=daemon
level=info msg="  --enable-stale-cilium-endpoint-cleanup='true'" subsys=daemon
level=info msg="  --enable-svc-source-range-check='true'" subsys=daemon
level=info msg="  --enable-tracing='false'" subsys=daemon
level=info msg="  --enable-unreachable-routes='false'" subsys=daemon
level=info msg="  --enable-vtep='false'" subsys=daemon
level=info msg="  --enable-well-known-identities='false'" subsys=daemon
level=info msg="  --enable-wireguard='false'" subsys=daemon
level=info msg="  --enable-wireguard-userspace-fallback='false'" subsys=daemon
level=info msg="  --enable-xdp-prefilter='false'" subsys=daemon
level=info msg="  --enable-xt-socket-fallback='true'" subsys=daemon
level=info msg="  --encrypt-interface=''" subsys=daemon
level=info msg="  --encrypt-node='false'" subsys=daemon
level=info msg="  --endpoint-gc-interval='5m0s'" subsys=daemon
level=info msg="  --endpoint-queue-size='25'" subsys=daemon
level=info msg="  --endpoint-status=''" subsys=daemon
level=info msg="  --eni-tags='{}'" subsys=daemon
level=info msg="  --envoy-config-timeout='2m0s'" subsys=daemon
level=info msg="  --envoy-log=''" subsys=daemon
level=info msg="  --exclude-local-address=''" subsys=daemon
level=info msg="  --fixed-identity-mapping=''" subsys=daemon
level=info msg="  --force-local-policy-eval-at-source='false'" subsys=daemon
level=info msg="  --fqdn-regex-compile-lru-size='1024'" subsys=daemon
level=info msg="  --gops-port='9890'" subsys=daemon
level=info msg="  --http-403-msg=''" subsys=daemon
level=info msg="  --http-idle-timeout='0'" subsys=daemon
level=info msg="  --http-max-grpc-timeout='0'" subsys=daemon
level=info msg="  --http-normalize-path='true'" subsys=daemon
level=info msg="  --http-request-timeout='3600'" subsys=daemon
level=info msg="  --http-retry-count='3'" subsys=daemon
level=info msg="  --http-retry-timeout='0'" subsys=daemon
level=info msg="  --hubble-disable-tls='false'" subsys=daemon
level=info msg="  --hubble-event-buffer-capacity='4095'" subsys=daemon
level=info msg="  --hubble-event-queue-size='0'" subsys=daemon
level=info msg="  --hubble-export-file-compress='false'" subsys=daemon
level=info msg="  --hubble-export-file-max-backups='5'" subsys=daemon
level=info msg="  --hubble-export-file-max-size-mb='10'" subsys=daemon
level=info msg="  --hubble-export-file-path=''" subsys=daemon
level=info msg="  --hubble-listen-address=''" subsys=daemon
level=info msg="  --hubble-metrics=''" subsys=daemon
level=info msg="  --hubble-metrics-server=''" subsys=daemon
level=info msg="  --hubble-prefer-ipv6='false'" subsys=daemon
level=info msg="  --hubble-recorder-sink-queue-size='1024'" subsys=daemon
level=info msg="  --hubble-recorder-storage-path='/var/run/cilium/pcaps'" subsys=daemon
level=info msg="  --hubble-skip-unknown-cgroup-ids='true'" subsys=daemon
level=info msg="  --hubble-socket-path='/var/run/cilium/hubble.sock'" subsys=daemon
level=info msg="  --hubble-tls-cert-file=''" subsys=daemon
level=info msg="  --hubble-tls-client-ca-files=''" subsys=daemon
level=info msg="  --hubble-tls-key-file=''" subsys=daemon
level=info msg="  --identity-allocation-mode='crd'" subsys=daemon
level=info msg="  --identity-change-grace-period='5s'" subsys=daemon
level=info msg="  --identity-gc-interval='15m0s'" subsys=daemon
level=info msg="  --identity-heartbeat-timeout='30m0s'" subsys=daemon
level=info msg="  --identity-restore-grace-period='10m0s'" subsys=daemon
level=info msg="  --install-egress-gateway-routes='false'" subsys=daemon
level=info msg="  --install-iptables-rules='true'" subsys=daemon
level=info msg="  --install-no-conntrack-iptables-rules='false'" subsys=daemon
level=info msg="  --ip-allocation-timeout='2m0s'" subsys=daemon
level=info msg="  --ip-masq-agent-config-path='/etc/config/ip-masq-agent'" subsys=daemon
level=info msg="  --ipam='kubernetes'" subsys=daemon
level=info msg="  --ipsec-key-file=''" subsys=daemon
level=info msg="  --iptables-lock-timeout='5s'" subsys=daemon
level=info msg="  --iptables-random-fully='false'" subsys=daemon
level=info msg="  --ipv4-native-routing-cidr=''" subsys=daemon
level=info msg="  --ipv4-node='auto'" subsys=daemon
level=info msg="  --ipv4-pod-subnets=''" subsys=daemon
level=info msg="  --ipv4-range='auto'" subsys=daemon
level=info msg="  --ipv4-service-loopback-address='169.254.42.1'" subsys=daemon
level=info msg="  --ipv4-service-range='auto'" subsys=daemon
level=info msg="  --ipv6-cluster-alloc-cidr='f00d::/64'" subsys=daemon
level=info msg="  --ipv6-mcast-device=''" subsys=daemon
level=info msg="  --ipv6-native-routing-cidr=''" subsys=daemon
level=info msg="  --ipv6-node='auto'" subsys=daemon
level=info msg="  --ipv6-pod-subnets=''" subsys=daemon
level=info msg="  --ipv6-range='auto'" subsys=daemon
level=info msg="  --ipv6-service-range='auto'" subsys=daemon
level=info msg="  --join-cluster='false'" subsys=daemon
level=info msg="  --k8s-api-server=''" subsys=daemon
level=info msg="  --k8s-client-burst='0'" subsys=daemon
level=info msg="  --k8s-client-qps='0'" subsys=daemon
level=info msg="  --k8s-heartbeat-timeout='30s'" subsys=daemon
level=info msg="  --k8s-kubeconfig-path=''" subsys=daemon
level=info msg="  --k8s-namespace='kube-system'" subsys=daemon
level=info msg="  --k8s-require-ipv4-pod-cidr='false'" subsys=daemon
level=info msg="  --k8s-require-ipv6-pod-cidr='false'" subsys=daemon
level=info msg="  --k8s-service-cache-size='128'" subsys=daemon
level=info msg="  --k8s-service-proxy-name=''" subsys=daemon
level=info msg="  --k8s-sync-timeout='3m0s'" subsys=daemon
level=info msg="  --k8s-watcher-endpoint-selector='metadata.name!=kube-scheduler,metadata.name!=kube-controller-manager,metadata.name!=etcd-operator,metadata.name!=gcp-controller-manager'" subsys=daemon
level=info msg="  --keep-config='false'" subsys=daemon
level=info msg="  --kube-proxy-replacement='strict'" subsys=daemon
level=info msg="  --kube-proxy-replacement-healthz-bind-address=''" subsys=daemon
level=info msg="  --kvstore=''" subsys=daemon
level=info msg="  --kvstore-connectivity-timeout='2m0s'" subsys=daemon
level=info msg="  --kvstore-lease-ttl='15m0s'" subsys=daemon
level=info msg="  --kvstore-max-consecutive-quorum-errors='2'" subsys=daemon
level=info msg="  --kvstore-opt=''" subsys=daemon
level=info msg="  --kvstore-periodic-sync='5m0s'" subsys=daemon
level=info msg="  --label-prefix-file=''" subsys=daemon
level=info msg="  --labels=''" subsys=daemon
level=info msg="  --lib-dir='/var/lib/cilium'" subsys=daemon
level=info msg="  --local-max-addr-scope='252'" subsys=daemon
level=info msg="  --local-router-ipv4=''" subsys=daemon
level=info msg="  --local-router-ipv6=''" subsys=daemon
level=info msg="  --log-driver=''" subsys=daemon
level=info msg="  --log-opt=''" subsys=daemon
level=info msg="  --log-system-load='false'" subsys=daemon
level=info msg="  --max-controller-interval='0'" subsys=daemon
level=info msg="  --metrics=''" subsys=daemon
level=info msg="  --mke-cgroup-mount=''" subsys=daemon
level=info msg="  --monitor-aggregation='medium'" subsys=daemon
level=info msg="  --monitor-aggregation-flags='all'" subsys=daemon
level=info msg="  --monitor-aggregation-interval='5s'" subsys=daemon
level=info msg="  --monitor-queue-size='0'" subsys=daemon
level=info msg="  --mtu='0'" subsys=daemon
level=info msg="  --node-port-acceleration='disabled'" subsys=daemon
level=info msg="  --node-port-algorithm='random'" subsys=daemon
level=info msg="  --node-port-bind-protection='true'" subsys=daemon
level=info msg="  --node-port-mode='snat'" subsys=daemon
level=info msg="  --node-port-range='30000,32767'" subsys=daemon
level=info msg="  --nodes-gc-interval='5m0s'" subsys=daemon
level=info msg="  --operator-api-serve-addr='127.0.0.1:9234'" subsys=daemon
level=info msg="  --operator-prometheus-serve-addr=':9963'" subsys=daemon
level=info msg="  --policy-audit-mode='false'" subsys=daemon
level=info msg="  --policy-queue-size='100'" subsys=daemon
level=info msg="  --policy-trigger-interval='1s'" subsys=daemon
level=info msg="  --pprof='false'" subsys=daemon
level=info msg="  --pprof-address='localhost'" subsys=daemon
level=info msg="  --pprof-port='6060'" subsys=daemon
level=info msg="  --preallocate-bpf-maps='false'" subsys=daemon
level=info msg="  --prepend-iptables-chains='true'" subsys=daemon
level=info msg="  --procfs='/host/proc'" subsys=daemon
level=info msg="  --prometheus-serve-addr=':9962'" subsys=daemon
level=info msg="  --proxy-connect-timeout='1'" subsys=daemon
level=info msg="  --proxy-gid='1337'" subsys=daemon
level=info msg="  --proxy-max-connection-duration-seconds='0'" subsys=daemon
level=info msg="  --proxy-max-requests-per-connection='0'" subsys=daemon
level=info msg="  --proxy-prometheus-port='9964'" subsys=daemon
level=info msg="  --read-cni-conf=''" subsys=daemon
level=info msg="  --remove-cilium-node-taints='true'" subsys=daemon
level=info msg="  --restore='true'" subsys=daemon
level=info msg="  --route-metric='0'" subsys=daemon
level=info msg="  --set-cilium-is-up-condition='true'" subsys=daemon
level=info msg="  --sidecar-istio-proxy-image='cilium/istio_proxy'" subsys=daemon
level=info msg="  --single-cluster-route='false'" subsys=daemon
level=info msg="  --skip-cnp-status-startup-clean='false'" subsys=daemon
level=info msg="  --socket-path='/var/run/cilium/cilium.sock'" subsys=daemon
level=info msg="  --sockops-enable='false'" subsys=daemon
level=info msg="  --srv6-encap-mode='reduced'" subsys=daemon
level=info msg="  --state-dir='/var/run/cilium'" subsys=daemon
level=info msg="  --synchronize-k8s-nodes='true'" subsys=daemon
level=info msg="  --tofqdns-dns-reject-response-code='refused'" subsys=daemon
level=info msg="  --tofqdns-enable-dns-compression='true'" subsys=daemon
level=info msg="  --tofqdns-endpoint-max-ip-per-hostname='50'" subsys=daemon
level=info msg="  --tofqdns-idle-connection-grace-period='0s'" subsys=daemon
level=info msg="  --tofqdns-max-deferred-connection-deletes='10000'" subsys=daemon
level=info msg="  --tofqdns-min-ttl='3600'" subsys=daemon
level=info msg="  --tofqdns-pre-cache=''" subsys=daemon
level=info msg="  --tofqdns-proxy-port='0'" subsys=daemon
level=info msg="  --tofqdns-proxy-response-max-delay='100ms'" subsys=daemon
level=info msg="  --trace-payloadlen='128'" subsys=daemon
level=info msg="  --trace-sock='true'" subsys=daemon
level=info msg="  --tunnel='vxlan'" subsys=daemon
level=info msg="  --tunnel-port='0'" subsys=daemon
level=info msg="  --unmanaged-pod-watcher-interval='15'" subsys=daemon
level=info msg="  --version='false'" subsys=daemon
level=info msg="  --vlan-bpf-bypass=''" subsys=daemon
level=info msg="  --vtep-cidr=''" subsys=daemon
level=info msg="  --vtep-endpoint=''" subsys=daemon
level=info msg="  --vtep-mac=''" subsys=daemon
level=info msg="  --vtep-mask=''" subsys=daemon
level=info msg="  --write-cni-conf-when-ready=''" subsys=daemon
level=info msg="     _ _ _" subsys=daemon
level=info msg=" ___|_| |_|_ _ _____" subsys=daemon
level=info msg="|  _| | | | | |     |" subsys=daemon
level=info msg="|___|_|_|_|___|_|_|_|" subsys=daemon
level=info msg="Cilium 1.13.2 8cb94c70 2023-04-17T23:19:21+02:00 go version go1.19.8 linux/amd64" subsys=daemon
level=info msg="cilium-envoy  version: 9d3ac11580c59ef31521f2b3df33014642c1fd3b/1.23.8/Distribution/RELEASE/BoringSSL" subsys=daemon
level=info msg="clang (10.0.0) and kernel (5.15.0) versions: OK!" subsys=linux-datapath
level=info msg="linking environment: OK!" subsys=linux-datapath
level=info msg="Kernel config file not found: if the agent fails to start, check the system requirements at https://docs.cilium.io/en/stable/operations/system_requirements" subsys=probes
level=info msg="Detected mounted BPF filesystem at /sys/fs/bpf" subsys=bpf
level=info msg="Mounted cgroupv2 filesystem at /run/cilium/cgroupv2" subsys=cgroups
level=info msg="Parsing base label prefixes from default label list" subsys=labels-filter
level=info msg="Parsing additional label prefixes from user inputs: []" subsys=labels-filter
level=info msg="Final label prefixes to be used for identity evaluation:" subsys=labels-filter
level=info msg=" - reserved:.*" subsys=labels-filter
level=info msg=" - :io\\.kubernetes\\.pod\\.namespace" subsys=labels-filter
level=info msg=" - :io\\.cilium\\.k8s\\.namespace\\.labels" subsys=labels-filter
level=info msg=" - :app\\.kubernetes\\.io" subsys=labels-filter
level=info msg=" - !:io\\.kubernetes" subsys=labels-filter
level=info msg=" - !:kubernetes\\.io" subsys=labels-filter
level=info msg=" - !:.*beta\\.kubernetes\\.io" subsys=labels-filter
level=info msg=" - !:k8s\\.io" subsys=labels-filter
level=info msg=" - !:pod-template-generation" subsys=labels-filter
level=info msg=" - !:pod-template-hash" subsys=labels-filter
level=info msg=" - !:controller-revision-hash" subsys=labels-filter
level=info msg=" - !:annotation.*" subsys=labels-filter
level=info msg=" - !:etcd_node" subsys=labels-filter
level=info msg="Using autogenerated IPv4 allocation range" subsys=node v4Prefix=10.74.0.0/16
level=info msg=Invoked duration="712.993µs" function="gops.registerGopsHooks (cell.go:39)" subsys=hive
level=info msg=Invoked duration=6.244811ms function="cmd.glob..func2 (daemon_main.go:1644)" subsys=hive
level=info msg="Started gops server" address="127.0.0.1:9890" subsys=gops
level=info msg="Start hook executed" duration="319.341µs" function="gops.registerGopsHooks.func1 (cell.go:44)" subsys=hive
level=info msg="Establishing connection to apiserver" host="https://192.168.44.4:6443" subsys=k8s-client
level=info msg="Connected to apiserver" subsys=k8s-client
level=info msg="Start hook executed" duration=9.136133ms function="client.(*compositeClientset).onStart" subsys=hive
level=info msg="Start hook executed" duration=2.170446ms function="cmd.newDatapath.func1 (daemon_main.go:1624)" subsys=hive
level=info msg="Start hook executed" duration="3.356µs" function="*resource.resource[*k8s.io/api/core/v1.Node].Start" subsys=hive
level=info msg="Start hook executed" duration=922ns function="*resource.resource[*github.com/cilium/cilium/pkg/k8s/apis/cilium.io/v2.CiliumNode].Start" subsys=hive
level=info msg="Auto-enabling \"enable-node-port\", \"enable-external-ips\", \"bpf-lb-sock\", \"enable-host-port\", \"enable-session-affinity\" features" subsys=daemon
level=info msg="Inheriting MTU from external network interface" device=ens3 ipAddr=192.168.42.74 mtu=1500 subsys=mtu
level=error msg="Start hook failed" error="daemon creation failed: unable to setup device manager: protocol not supported" function="cmd.newDaemonPromise.func1 (daemon_main.go:1677)" subsys=hive
level=info msg="Stop hook executed" duration="9.758µs" function="*resource.resource[*github.com/cilium/cilium/pkg/k8s/apis/cilium.io/v2.CiliumNode].Stop" subsys=hive
level=info msg="Stop hook executed" duration="3.627µs" function="*resource.resource[*k8s.io/api/core/v1.Node].Stop" subsys=hive
level=info msg="Stop hook executed" duration="15.639µs" function="client.(*compositeClientset).onStop" subsys=hive
level=info msg="Stopped gops server" address="127.0.0.1:9890" subsys=gops
level=info msg="Stop hook executed" duration="114.696µs" function="gops.registerGopsHooks.func2 (cell.go:51)" subsys=hive
level=fatal msg="failed to start: daemon creation failed: unable to setup device manager: protocol not supported" subsys=daemon

@imscaradh Can you share your Terraform setup so I can try to reproduce the issue? If not: is Cilium the only pod failing? Does it resolve itself after 5-10 min? Are you deploying on Infomaniak or another OpenStack provider?
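If it helps, something along these lines would already give useful details (a rough sketch; the deployment and label names assume the default rke2-cilium chart and may differ in your setup):

kubectl -n kube-system get pods -o wide                    # which pods fail, and on which nodes
kubectl -n kube-system describe deploy cilium-operator     # why the operator stays Pending
kubectl -n kube-system logs -l k8s-app=cilium --tail=50 --prefix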

@zifeo I am deploying to the Infomaniak cloud. Here is my Terraform setup (I tried various combinations of Ubuntu 22.04/20.04 and Kubernetes v1.27/v1.26):

module "rke2" {
  # source = "zifeo/rke2/openstack"
  source = "./../.."

  # must be true for single-server cluster or only on first run for HA cluster
  bootstrap                 = true
  name                      = "staging-openstack"
  #ssh_public_key_file      = "~/.ssh/id_rsa.pub"
  floating_pool             = "ext-floating1"

  # should be restricted to a secure bastion
  rules_ssh_cidr            = "0.0.0.0/0"
  rules_k8s_cidr            = "0.0.0.0/0"
  # auto-load manifests from a folder (https://docs.rke2.io/advanced#auto-deploying-manifests)
  manifests_folder          = "./manifests"

  servers = [{
    name             = "server"

    flavor_name      = "a4-ram8-disk0"
    image_name       = "Ubuntu 22.04 LTS Jammy Jellyfish"
    system_user      = "ubuntu"
    boot_volume_size = 20

    rke2_version     = "v1.27.2+rke2r1"
    rke2_volume_size = 6
    # https://docs.rke2.io/install/install_options/server_config/
    rke2_config = <<EOF
etcd-snapshot-schedule-cron: "0 */6 * * *"
etcd-snapshot-retention: 20

control-plane-resource-requests: kube-apiserver-cpu=75m,kube-apiserver-memory=128M,kube-scheduler-cpu=75m,kube-scheduler-memory=128M,kube-controller-manager-cpu=75m,kube-controller-manager-memory=128M,etcd-cpu=75m,etcd-memory=128M
  EOF
  }]

  agents = [
    {
      name             = "pool-a"
      nodes_count      = 1

      flavor_name      = "a4-ram8-disk0"
      image_name       = "Ubuntu 22.04 LTS Jammy Jellyfish"
      system_user      = "ubuntu"
      boot_volume_size = 20

      rke2_version     = "v1.27.2+rke2r1"
      rke2_volume_size = 6
    }
  ]

  # automatically remove agents from the cluster
  ff_autoremove_agent = true
  # rewrite kubeconfig
  ff_write_kubeconfig = true
  # deploy etcd backup
  ff_native_backup = true

  identity_endpoint     = "https://api.pub1.infomaniak.cloud/identity"
  object_store_endpoint = "s3.pub1.infomaniak.cloud"
}

provider "openstack" {
  tenant_name = "<censored>"
  user_name   = "<censored>"
  password    = "<censored>"
  auth_url    = "https://api.pub1.infomaniak.cloud/identity"
  region      = "dc3-a"
}

terraform {
  required_version = ">= 0.14.0"

  required_providers {
    openstack = {
      source  = "terraform-provider-openstack/openstack"
      version = "~> 1.51.1"
    }
  }
}

@zifeo the only thing I found was cilium/cilium#20901, but that is related to missing kernel modules (and Infomaniak isn't running on Raspberry Pis 😄).
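In case it helps, here is what I checked on the nodes to rule out missing kernel features (a rough sketch; the config file location and module names vary between distro images):

uname -r
grep -E 'CONFIG_BPF=|CONFIG_BPF_SYSCALL=|CONFIG_NET_CLS_BPF=' /boot/config-$(uname -r)
lsmod | grep -E 'vxlan|geneve'   # tunnel mode is vxlan in the agent flags above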

@imscaradh thanks, I managed to reproduce it and tested with other configs. So far, it seems to be an issue related to the image that Infomaniak provides and/or some of their kernels. What worked last week doesn't work this week. Let me try to debug this further with their support and get back to you.

@imscaradh the issue was identified (a regression with the default volume mapping). Can you try v2.0.5 and confirm all is good?
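Once upgraded, a quick way to confirm the agents are healthy could be (a minimal sketch, assuming the DaemonSet keeps its default name and labels):

kubectl -n kube-system get pods -l k8s-app=cilium
kubectl -n kube-system exec ds/cilium -- cilium status --brief   # should print "OK"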