kubectl-proxy doesn't work with netpols
george-angel opened this issue · comments
### Expected Behavior
Given a basic NetworkPolicy that allows traffic within a namespace, we would like kubectl-proxy functionality to be able to reach pods in that namespace.
```yaml
---
# Default: Allow all egress and ingress only from same ns
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default
  namespace: sys-mon
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: sys-mon
    - namespaceSelector:
        matchLabels:
          name: sys-prom
  egress:
  - {}
---
# Ingress Proxy: Private
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-sys-ingress-priv
  namespace: sys-mon
spec:
  podSelector:
    matchExpressions:
    - key: app
      operator: In
      values:
      - grafana
      - prometheus
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: sys-ingress-priv
```
The policies above allow all traffic within the namespace, traffic from the sys-prom namespace, and traffic from sys-ingress-priv, a namespace that contains internally facing ingress controllers.
With those policies applied:
```
timeout 2 curl -sSL -D - -o /dev/null http://localhost:8001/api/v1/namespaces/sys-mon/services/prometheus:9090/proxy/graph
```
the request fails. We also tried allowing traffic from the kube-system namespace, but the result is the same: the request times out.
If we delete the NetworkPolicy, the request returns 200. We also ran tcpdump on masters and workers to capture this:
master:
```
10:11:26.914112 IP (tos 0x0, ttl 64, id 23194, offset 0, flags [none], proto UDP (17), length 332)
    10.66.25.79.57534 > 10.66.25.73.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 1
IP (tos 0x0, ttl 64, id 43828, offset 0, flags [DF], proto TCP (6), length 282)
    10.2.6.0.46656 > 10.2.2.6.9090: Flags [P.], cksum 0xc670 (correct), seq 1:231, ack 1, win 209, options [nop,nop,TS val 747664665 ecr 3351424972], length 230
E..LZ...@...
B.O
B.I..!..8B...........8.C...N..x..E....4@.@.r.
...
....@#.....e6|......p.....
,.u.....GET /graph HTTP/1.1
Host: localhost:8001
User-Agent: curl/7.59.0
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 127.0.0.1, 91.217.237.4
X-Forwarded-Uri: /api/v1/namespaces/sys-mon/services/prometheus:9090/proxy/graph
```
worker:
```
10:11:26.914241 IP (tos 0x0, ttl 64, id 23194, offset 0, flags [none], proto UDP (17), length 332)
    10.66.25.79.57534 > 10.66.25.73.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 1
IP (tos 0x0, ttl 64, id 43828, offset 0, flags [DF], proto TCP (6), length 282)
    10.2.6.0.46656 > 10.2.2.6.9090: Flags [P.], cksum 0xc670 (correct), seq 1:231, ack 1, win 209, options [nop,nop,TS val 747664665 ecr 3351424972], length 230
E..LZ...@...
B.O
B.I..!..8B...........8.C...N..x..E....4@.@.r.
...
....@#.....e6|......p.....
,.u.....GET /graph HTTP/1.1
Host: localhost:8001
User-Agent: curl/7.59.0
Accept: */*
Accept-Encoding: gzip
X-Forwarded-For: 127.0.0.1, 91.217.237.4
X-Forwarded-Uri: /api/v1/namespaces/sys-mon/services/prometheus:9090/proxy/graph
```
Our api-server pods run using the host's network:
```
kubectl --context=exp-2 -nkube-system -owide get pod | awk 'NR==1 || /api/'
NAME                                                        READY  STATUS   RESTARTS  AGE  IP             NODE
kube-apiserver-ip-10-66-25-155.eu-west-1.compute.internal   1/1    Running  0         22h  10.66.25.155   ip-10-66-25-155.eu-west-1.compute.internal
kube-apiserver-ip-10-66-25-48.eu-west-1.compute.internal    1/1    Running  0         22h  10.66.25.48    ip-10-66-25-48.eu-west-1.compute.internal
kube-apiserver-ip-10-66-25-79.eu-west-1.compute.internal    1/1    Running  0         22h  10.66.25.79    ip-10-66-25-79.eu-west-1.compute.internal
```
So they don't have a separate IP in the pod range (10.2.0.0/16), and the request's source address is 10.2.X.0.
We can work around the problem by allowing traffic from kubelet/flannel/pods using the host's network with the following policy:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: kubelet-masq
  namespace: sys-mon
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress:
  - from:
    - ipBlock:
        cidr: 10.2.0.0/32
    - ipBlock:
        cidr: 10.2.1.0/32
    - ipBlock:
        cidr: 10.2.2.0/32
    - ipBlock:
        cidr: 10.2.3.0/32
    - ipBlock:
        cidr: 10.2.4.0/32
    ...
```
But that doesn't feel right, and it needs to be applied in every namespace.
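As an aside, the repetitive list of per-node /32 blocks could at least be generated rather than hand-written. A small sketch, assuming one 10.2.N.0/24 flannel subnet per node (the range 0..4 here is illustrative):

```shell
# Emit one ipBlock entry per node's flannel gateway address (10.2.N.0).
# Adjust the seq range to match the actual number of node subnets.
for n in $(seq 0 4); do
  printf -- '    - ipBlock:\n        cidr: 10.2.%d.0/32\n' "$n"
done
```

The output can be pasted under the policy's `from:` list, or templated into whatever tool renders the manifests.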
Is this something that is supposed to work, or is it something specific to our setup that creates this odd situation?
### Your Environment
- calico/node:v3.0.4
- calico/cni:v2.0.3
- coreos/flannel:v0.10.0
- Container Linux by CoreOS 1688.5.3 (Rhyolite)
@george-angel hm, yeah this is an interesting one.
I'm actually a bit fuzzy on the implementation details of kubectl proxy, but my understanding is that the requests will not go through your ingress controllers, which is probably why the second NP above isn't working as expected.
As you've discovered, since the pods are using host networking, Kubernetes network policy doesn't apply to them the way it does to other pods.
One option might be to use a Calico GlobalNetworkPolicy instead of k8s policies, in order to make it cluster-wide behavior instead of per-namespace. See here.
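For reference, a minimal sketch of such a GlobalNetworkPolicy (the policy name, order, and the /32 list are illustrative assumptions, not taken from this cluster; it is applied with calicoctl rather than kubectl, and is not namespaced):

```yaml
# Illustrative only: cluster-wide policy allowing ingress from the per-node
# flannel gateway addresses, which is how host-networked clients appear.
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: allow-host-network-ingress   # hypothetical name
spec:
  order: 100
  selector: all()
  types:
  - Ingress
  ingress:
  - action: Allow
    source:
      nets:
      - 10.2.0.0/32   # one /32 per node's overlay gateway,
      - 10.2.1.0/32   # mirroring the per-namespace policy above
```

Because it is a single cluster-wide resource, it avoids copying the same NetworkPolicy into every namespace.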
You can also use host endpoints or network sets to help simplify the definition a bit, but neither of those is perfect, since they aren't fully automated and still require someone to create the resources. That said, if you've got some sort of node config management in place, it could simply be a matter of adding a section to your terraform / ansible / etc. to create the above.
Thanks @caseydavenport, fixed using GlobalNetworkPolicies; for reference: utilitywarehouse/tf_kube_ignition#34