projectcalico / canal

Policy based networking for cloud native applications

Network for newly created pods fails

felskrone opened this issue

I am trying to set up my k8s cluster with canal, but pod networking is never set up properly and the kubelet logs errors.

Canal from here https://github.com/projectcalico/canal/blob/master/k8s-install/1.7/canal.yaml
RBAC from https://github.com/projectcalico/canal/blob/master/k8s-install/1.7/rbac.yaml

Installation of canal looks good.

master01: # kubectl create -f rbac.yaml
clusterrole "calico" created
clusterrole "flannel" created
clusterrolebinding "canal-flannel" created
clusterrolebinding "canal-calico" created

master01: # kubectl create -f canal.yaml
configmap "canal-config" created
daemonset "canal" created
customresourcedefinition "globalfelixconfigs.crd.projectcalico.org" created
customresourcedefinition "globalbgpconfigs.crd.projectcalico.org" created
customresourcedefinition "ippools.crd.projectcalico.org" created
customresourcedefinition "globalnetworkpolicies.crd.projectcalico.org" created
serviceaccount "canal" created

All pods seem to come up fine.

master01: # kubectl get pods --all-namespaces
NAMESPACE     NAME          READY     STATUS    RESTARTS   AGE
kube-system   canal-48b2q   3/3       Running   1          1m
kube-system   canal-55l5s   3/3       Running   0          1m
kube-system   canal-85h8c   3/3       Running   1          1m
kube-system   canal-9mkl5   3/3       Running   1          1m
kube-system   canal-gfzsf   3/3       Running   0          1m
kube-system   canal-jklmk   3/3       Running   0          1m
kube-system   canal-k5l5d   3/3       Running   1          1m
kube-system   canal-r13bp   3/3       Running   0          1m
kube-system   canal-s768v   3/3       Running   0          1m
kube-system   canal-x3b57   3/3       Running   1          1m

After that I create a simple busybox pod.

apiVersion: v1
kind: Pod
metadata:
  name: busybox-w2
  namespace: default
spec:
  containers:
  - image: busybox
    command:
      - sleep
      - "3600"
    imagePullPolicy: IfNotPresent
    name: busybox
  restartPolicy: Always

The busybox-pod never receives a proper network-config and stays in state 'ContainerCreating'.

master01: # kubectl get pods --all-namespaces
NAMESPACE     NAME          READY     STATUS              RESTARTS   AGE
default       busybox-w2    0/1       ContainerCreating   0          31s

Expected Behavior

Successful pod creation and proper setup of networking for the pod.

Current Behavior

The pod is never created, and the kubelet reports that an SSL certificate file is missing.

Nov 03 16:29:49 worker02 kubelet[12112]: I1103 16:29:49.095395   12112 kuberuntime_manager.go:557] SyncPod received new pod "busybox-w2_default(27c6d382-c0aa-11e7-bc63-0022195f6b5b)", will create a new sandbox for it
Nov 03 16:29:49 worker02 kubelet[12112]: I1103 16:29:49.095413   12112 kuberuntime_manager.go:566] Stopping PodSandbox for "busybox-w2_default(27c6d382-c0aa-11e7-bc63-0022195f6b5b)", will start new one
Nov 03 16:29:49 worker02 kubelet[12112]: I1103 16:29:49.095449   12112 kuberuntime_manager.go:612] Creating sandbox for pod "busybox-w2_default(27c6d382-c0aa-11e7-bc63-0022195f6b5b)"
Nov 03 16:29:49 worker02 kubelet[12112]: E1103 16:29:49.525442   12112 remote_runtime.go:91] RunPodSandbox from runtime service failed: rpc error: code = 2 desc = failed to create network for container k8s_infra_busybox-w2_default_27c6d382-c0aa-11e7-bc63-0022195f6b5b_0 in sandbox e846f3ac1bffad9bf76eccac8130104e3f62a2456dc82454a6af7d54ceba705e: open /etc/cni/net.d/calico-tls/etcd-cert: no such file or directory
Nov 03 16:29:49 worker02 kubelet[12112]: E1103 16:29:49.525511   12112 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "busybox-w2_default(27c6d382-c0aa-11e7-bc63-0022195f6b5b)" failed: rpc error: code = 2 desc = failed to create network for container k8s_infra_busybox-w2_default_27c6d382-c0aa-11e7-bc63-0022195f6b5b_0 in sandbox e846f3ac1bffad9bf76eccac8130104e3f62a2456dc82454a6af7d54ceba705e: open /etc/cni/net.d/calico-tls/etcd-cert: no such file or directory
Nov 03 16:29:49 worker02 kubelet[12112]: E1103 16:29:49.525537   12112 kuberuntime_manager.go:618] createPodSandbox for pod "busybox-w2_default(27c6d382-c0aa-11e7-bc63-0022195f6b5b)" failed: rpc error: code = 2 desc = failed to create network for container k8s_infra_busybox-w2_default_27c6d382-c0aa-11e7-bc63-0022195f6b5b_0 in sandbox e846f3ac1bffad9bf76eccac8130104e3f62a2456dc82454a6af7d54ceba705e: open /etc/cni/net.d/calico-tls/etcd-cert: no such file or directory
Nov 03 16:29:49 worker02 kubelet[12112]: E1103 16:29:49.525595   12112 pod_workers.go:182] Error syncing pod 27c6d382-c0aa-11e7-bc63-0022195f6b5b ("busybox-w2_default(27c6d382-c0aa-11e7-bc63-0022195f6b5b)"), skipping: failed to "CreatePodSandbox" for "busybox-w2_default(27c6d382-c0aa-11e7-bc63-0022195f6b5b)" with CreatePodSandboxError: "CreatePodSandbox for pod \"busybox-w2_default(27c6d382-c0aa-11e7-bc63-0022195f6b5b)\" failed: rpc error: code = 2 desc = failed to create network for container k8s_infra_busybox-w2_default_27c6d382-c0aa-11e7-bc63-0022195f6b5b_0 in sandbox e846f3ac1bffad9bf76eccac8130104e3f62a2456dc82454a6af7d54ceba705e: open /etc/cni/net.d/calico-tls/etcd-cert: no such file or directory"

I have not altered rbac.yaml or canal.yaml in any way, and there is no SSL configuration in either of them.

The 10-calico.conf written by the canal pods also has no SSL settings in it.

worker02:/etc/cni/net.d# cat 10-calico.conf
{
    "name": "k8s-pod-network",
    "cniVersion": "0.1.0",
    "type": "calico",
    "log_level": "info",
    "datastore_type": "kubernetes",
    "nodename": "worker02",
    "mtu": 1500,
    "ipam": {
        "type": "host-local",
        "subnet": "usePodCidr"
    },
    "policy": {
        "type": "k8s",
        "k8s_auth_token": "eyJhbGciOiJSUzI1NiIsInR5c....."
    },
    "kubernetes": {
        "k8s_api_root": "https://10.x.x.x:443",
        "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
    }
}

Where does the path '/etc/cni/net.d/calico-tls/etcd-cert' in the kubelet's logs come from?

Where can I debug what's going wrong?

Since the above canal.yaml uses the kubernetes-datastore, why is etcd involved in any way?

Your Environment

  • Calico version: quay.io/calico/node:v2.5.1 (see linked canal.yaml above)
  • Flannel version: quay.io/coreos/flannel:v0.8.0 (see linked canal.yaml above)
  • Orchestrator version: Kubernetes 1.7.6 with RBAC
  • Operating System and version: Debian Stretch

@felskrone interesting. Agreed that with the Kubernetes API datastore, etcd certs shouldn't be involved at all.

Are there any other config files in /etc/cni/net.d?
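One way to answer that question is to list the CNI config files in the directory and flag any that mention etcd. The sketch below rests on assumptions: the kubelet generally uses the first config file in lexical order, so a leftover file from an earlier etcd-based install that sorts before 10-calico.conf could be the one actually loaded. The function name `scan_cni_dir` and the set of file extensions it checks are illustrative choices, not part of canal or Calico.

```shell
#!/bin/sh
# Hypothetical helper: print each CNI config file in a directory, in the
# order the shell glob sorts them, and mark the ones that mention "etcd".
# The default directory is the one shown in the issue; the extensions
# checked (.conf, .conflist, .json) are an assumption about what the
# kubelet considers a CNI config file.
scan_cni_dir() {
    dir="${1:-/etc/cni/net.d}"
    for f in "$dir"/*; do
        # Skip subdirectories such as calico-tls and anything not a regular file.
        [ -f "$f" ] || continue
        case "$f" in
            *.conf|*.conflist|*.json) ;;
            *) continue ;;
        esac
        if grep -q 'etcd' "$f"; then
            echo "ETCD  $f"
        else
            echo "ok    $f"
        fi
    done
}
```

Running `scan_cni_dir /etc/cni/net.d` on the affected worker would show whether a file listed before 10-calico.conf still carries etcd settings.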

I have not figured out what's wrong, but I restarted from scratch and that seems to have resolved this.

I have another question, but it's not related to this error.