kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management

Home Page: https://kops.sigs.k8s.io/

kube-scheduler pod goes into CrashLoopBackOff status because of incorrect arguments

SohamChakraborty opened this issue

/kind bug

1. What kops version are you running? The command kops version will display this information.

$ kops version
Client version: 1.25.3 (git-v1.25.3)

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"clean", BuildDate:"2022-03-16T15:58:47Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.15", GitCommit:"b84cb8ab29366daa1bba65bc67f54de2f6c34848", GitTreeState:"clean", BuildDate:"2022-12-08T10:42:57Z", GoVersion:"go1.17.13", Compiler:"gc", Platform:"linux/amd64"}

3. What cloud provider are you using?

AWS

4. What commands did you run? What is the simplest way to reproduce this issue?

We ran kops rolling-update cluster <cluster_name> --yes --cloudonly to re-create the nodes. The new master node that came up didn't have a running kube-scheduler pod.
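Roughly the sequence, with the cluster name as a placeholder (with --cloudonly, kops replaces nodes without validating the cluster in between, so problems only surface when checked afterwards):

# Recreate all nodes, skipping cluster validation between replacements
kops rolling-update cluster <cluster_name> --yes --cloudonly

# Check the result afterwards
kops validate cluster --name <cluster_name>
kubectl get nodes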

5. What happened after the commands executed?

New nodes (master and worker) came up, but the master node stayed in NotReady status because no CNI configuration was found in the /etc/cni/net.d directory. On further investigation, we found that the kube-scheduler pod was not running.
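For context, roughly the checks used to narrow this down (SSH user and master address are illustrative; the log path comes from the static pod manifest further below):

kubectl get nodes                                        # new master stuck in NotReady
kubectl -n kube-system get pods -o wide | grep kube-scheduler
ssh ubuntu@<master_ip> 'ls /etc/cni/net.d'               # empty: no CNI config written yet
ssh ubuntu@<master_ip> 'sudo tail -n 20 /var/log/kube-scheduler.log'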

6. What did you expect to happen?
The master node would come up automatically in a healthy, functioning state with all kube-system pods running.

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: null
  generation: 68
  name: k8s.foobar.com
spec:
  api:
    loadBalancer:
      class: Network
      type: Public
  authorization:
    rbac: {}
  channel: stable
  cloudLabels:
    App: k8s
    Env: 
    Region: us-east-1
  cloudProvider: aws
  clusterAutoscaler:
    awsUseStaticInstanceList: false
    balanceSimilarNodeGroups: false
    cpuRequest: 100m
    enabled: true
    expander: least-waste
    image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.23.1
    memoryRequest: 300Mi
    newPodScaleUpDelay: 0s
    scaleDownDelayAfterAdd: 10m0s
    scaleDownUnneededTime: 5m0s
    scaleDownUnreadyTime: 10m0s
    scaleDownUtilizationThreshold: "0.6"
    skipNodesWithLocalStorage: true
    skipNodesWithSystemPods: true
  configBase: s3://com.foobar.k8s-state/k8s.foobar.com
  containerRuntime: docker
  dnsZone: 123456
  docker:
    experimental: true
    ipMasq: false
    ipTables: false
    logDriver: json-file
    logLevel: info
    logOpt:
    - max-size=10m
    - max-file=5
    storage: overlay2
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - instanceGroup: master-us-east-1a
      name: a
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - instanceGroup: master-us-east-1a
      name: a
    memoryRequest: 100Mi
    name: events
  fileAssets:
  - content: |
      apiVersion: audit.k8s.io/v1
      kind: Policy
      rules:
      - level: Metadata
    name: audit-policy-config
    path: /var/log/audit/policy-config.yaml
    roles:
    - Master
  iam:
    allowContainerRegistry: true
    legacy: false
  kubeAPIServer:
    auditLogMaxAge: 10
    auditLogMaxBackups: 1
    auditLogMaxSize: 100
    auditLogPath: /var/log/kube-apiserver-audit.log
    auditPolicyFile: /var/log/audit/policy-config.yaml
    auditWebhookBatchMaxWait: 5s
    auditWebhookConfigFile: /var/log/audit/webhook-config.yaml
  kubeDNS:
    provider: CoreDNS
  kubeScheduler:
    usePolicyConfigMap: true
  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook
    maxPods: 150
    shutdownGracePeriod: 1m0s
    shutdownGracePeriodCriticalPods: 30s
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: 1.23.15
  masterInternalName: api.internal.k8s.foobar.com
  masterPublicName: api.k8s.foobar.com
  networkCIDR: 10.4.0.0/16
  networkID: vpc-123456
  networking:
    calico:
      awsSrcDstCheck: Disable
      encapsulationMode: ipip
      ipipMode: CrossSubnet
      wireguardEnabled: true
  nonMasqueradeCIDR: 100.64.0.0/10
  rollingUpdate:
    maxSurge: 4
  sshAccess:
  - 0.0.0.0/0
  subnets:
    <SNIPPED>
  topology:
    dns:
      type: Private
    masters: private
    nodes: private
---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2019-12-18T06:34:20Z"
  generation: 13
  labels:
    kops.k8s.io/cluster: k8s.foobar.com
  name: master-us-east-1a
spec:
  image: ubuntu/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20221212
  instanceMetadata:
    httpPutResponseHopLimit: 2
    httpTokens: required
  machineType: c5a.xlarge
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-east-1a
  role: Master
  rootVolumeEncryption: true
  rootVolumeSize: 30
  subnets:
  - us-east-1a
---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2019-12-18T06:34:21Z"
  generation: 75
  labels:
    kops.k8s.io/cluster: k8s.foobar.com
  name: nodes-us-east-1a
spec:
  additionalUserData:
  - content: |
      apt-get update
      apt-get install -y qemu-user-static
    name: 0prereqs.sh
    type: text/x-shellscript
  cloudLabels:
    k8s.io/cluster-autoscaler/enabled: ""
    k8s.io/cluster-autoscaler/k8s.foobar.com: ""
  externalLoadBalancers:
  - targetGroupArn: <ELB_ARN>
  image: ubuntu/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20221212
  instanceMetadata:
    httpPutResponseHopLimit: 2
    httpTokens: required
  machineType: m5a.4xlarge
  maxSize: 7
  minSize: 2
  mixedInstancesPolicy:
    instances:
    - m5a.4xlarge
    - m5.4xlarge
    - m5d.4xlarge
    - m5ad.4xlarge
    - r5.4xlarge
    - r5a.4xlarge
    - r4.4xlarge
    - r5d.4xlarge
    - i3.4xlarge
    - r5ad.4xlarge
    - r5.8xlarge
    onDemandAboveBase: 0
    onDemandBase: 0
    spotAllocationStrategy: capacity-optimized
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-us-east-1a
  role: Node
  rootVolumeEncryption: true
  rootVolumeSize: 100
  subnets:
  - us-east-1a

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else do we need to know?

State of kube-scheduler pod:

kube-scheduler-ip-1-2-3-4.us-east-1.compute.internal            0/1     CrashLoopBackOff   5 (70s ago)     3m26s

Logs showed this at the end:

Error: unknown flag: --policy-configmap-namespace
2024/02/26 18:18:33 running command: exit status 1
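These two flags belong to the legacy scheduler Policy API, which was removed upstream in Kubernetes 1.23, so the v1.23.15 binary no longer accepts them. A quick way to confirm against the exact image in use (a sketch; it assumes the image can be run ad hoc, which works here because the cluster uses the docker runtime):

# Binary path matches the one referenced in the static pod manifest below
docker run --rm registry.k8s.io/kube-scheduler:v1.23.15 \
  /usr/local/bin/kube-scheduler --help 2>&1 | grep -- --policy-configmap \
  || echo "flag not present in v1.23.15"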

The kube-scheduler manifest was this:

$ cat /etc/kubernetes/manifests/kube-scheduler.manifest 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - args:
    - --log-file=/var/log/kube-scheduler.log
    - --also-stdout
    - /usr/local/bin/kube-scheduler
    - --authentication-kubeconfig=/var/lib/kube-scheduler/kubeconfig
    - --authorization-kubeconfig=/var/lib/kube-scheduler/kubeconfig
    - --config=/var/lib/kube-scheduler/config.yaml
    - --feature-gates=CSIMigrationAWS=true,InTreePluginAWSUnregister=true
    - --leader-elect=true
    - --policy-configmap-namespace=kube-system
    - --policy-configmap=scheduler-policy
    - --tls-cert-file=/srv/kubernetes/kube-scheduler/server.crt
    - --tls-private-key-file=/srv/kubernetes/kube-scheduler/server.key
    - --v=2
    command:
    - /go-runner
    image: registry.k8s.io/kube-scheduler:v1.23.15@sha256:9accf0bab7275b3a7704f5fcbc27d7a7820ce9209cffd4634214cfb4536fa4ca
    livenessProbe:
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-scheduler
    resources:
      requests:
        cpu: 100m
    volumeMounts:
    - mountPath: /var/lib/kube-scheduler
      name: varlibkubescheduler
      readOnly: true
    - mountPath: /srv/kubernetes/kube-scheduler
      name: srvscheduler
      readOnly: true
    - mountPath: /var/log/kube-scheduler.log
      name: logfile
  hostNetwork: true
  priorityClassName: system-cluster-critical
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  volumes:
  - hostPath:
      path: /var/lib/kube-scheduler
    name: varlibkubescheduler
  - hostPath:
      path: /srv/kubernetes/kube-scheduler
    name: srvscheduler
  - hostPath:
      path: /var/log/kube-scheduler.log
    name: logfile
status: {}

Had to remove both --policy-configmap-namespace=kube-system and --policy-configmap=scheduler-policy to get the kube-scheduler pod to run (a shell sketch of that edit follows the manifest below). The manifest after the change is:

$ cat /etc/kubernetes/manifests/kube-scheduler.manifest 
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - args:
    - --log-file=/var/log/kube-scheduler.log
    - --also-stdout
    - /usr/local/bin/kube-scheduler
    - --authentication-kubeconfig=/var/lib/kube-scheduler/kubeconfig
    - --authorization-kubeconfig=/var/lib/kube-scheduler/kubeconfig
    - --config=/var/lib/kube-scheduler/config.yaml
    - --feature-gates=CSIMigrationAWS=true,InTreePluginAWSUnregister=true
    - --leader-elect=true
    - --tls-cert-file=/srv/kubernetes/kube-scheduler/server.crt
    - --tls-private-key-file=/srv/kubernetes/kube-scheduler/server.key
    - --v=2
    command:
    - /go-runner
    image: registry.k8s.io/kube-scheduler:v1.23.15@sha256:9accf0bab7275b3a7704f5fcbc27d7a7820ce9209cffd4634214cfb4536fa4ca
    livenessProbe:
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-scheduler
    resources:
      requests:
        cpu: 100m
    volumeMounts:
    - mountPath: /var/lib/kube-scheduler
      name: varlibkubescheduler
      readOnly: true
    - mountPath: /srv/kubernetes/kube-scheduler
      name: srvscheduler
      readOnly: true
    - mountPath: /var/log/kube-scheduler.log
      name: logfile
  hostNetwork: true
  priorityClassName: system-cluster-critical
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  volumes:
  - hostPath:
      path: /var/lib/kube-scheduler
    name: varlibkubescheduler
  - hostPath:
      path: /srv/kubernetes/kube-scheduler
    name: srvscheduler
  - hostPath:
      path: /var/log/kube-scheduler.log
    name: logfile
status: {}
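The edit itself was just dropping those two lines from the static pod manifest on the control-plane node; the kubelet watches /etc/kubernetes/manifests and restarts the pod on its own. As a shell sketch (assuming SSH access to the node):

sudo sed -i \
  -e '/--policy-configmap-namespace=kube-system/d' \
  -e '/--policy-configmap=scheduler-policy/d' \
  /etc/kubernetes/manifests/kube-scheduler.manifest

# Wait for the kubelet to restart the static pod, then confirm it is running
kubectl -n kube-system get pods | grep kube-scheduler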

@SohamChakraborty Thank you for reporting this issue. The fix should be part of future releases.
Please also remove kubeScheduler.usePolicyConfigMap from your config. That should fix the problem long term.
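For reference, a rough sketch of applying that change with kops (cluster name taken from the manifest above; the control plane is rolled so the static pod manifest gets regenerated):

kops edit cluster k8s.foobar.com            # delete the usePolicyConfigMap: true line under kubeScheduler
kops update cluster k8s.foobar.com --yes
kops rolling-update cluster k8s.foobar.com --instance-group master-us-east-1a --yes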

Thank you @hakman for fixing this so quickly.