This project is intended to provide two things:
- Automation to quickly and easily spin up a Kubernetes lab to learn about its security on nearly any machine (Windows, Mac (both Intel and M1/M2) or Linux). It can run in as little as 2 CPUs and 4GB of RAM, so it should run on even modest laptops.
  - And, if you make a mistake, you can just delete the VM and re-run this automation to be back to a working environment in minutes (so you can learn and tinker without any fear)!
- Various demonstration scenarios, documented in this README, to help you learn first-hand about many of the common Kubernetes security features and challenges - and how to address them - in that lab environment.
I'm a hands-on learner and built this for myself a while back - and now I want to share it with the community.
This lab provisions an Ubuntu virtual machine (VM) with multipass and then installs microk8s within it.
Mac (via a VM managed by multipass):
- (If you don't already have it) Install Homebrew by running `/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"`
- Install microk8s with `brew install microk8s`
- Clone this repo - `git clone https://github.com/jasonumiker-sysdig/kubernetes-security-demos.git`
- Run `setup-cluster/setup-microk8s-vm.sh`
Windows (via a VM managed by multipass):
- Be running a Pro or Enterprise version of Windows 10/11 that can do Hyper-V
- Install microk8s - https://microk8s.io/docs/install-windows
- Install git - https://gitforwindows.org/
- Clone this repo - `git clone https://github.com/jasonumiker-sysdig/kubernetes-security-demos.git`
- Run `setup-cluster/setup-microk8s-vm.sh` from within a Git Bash shell/session
Linux (via a VM managed by multipass):
- Install snap if it isn't included in your distro (e.g. `sudo dnf install snapd` on Fedora)
- Run `snap install multipass`
- Clone this repo - `git clone https://github.com/jasonumiker-sysdig/kubernetes-security-demos.git`
- Run `setup-cluster/setup-microk8s-vm.sh`
- Run `multipass shell microk8s-vm`
OR
- Run `cd ~/.kube` and then `multipass transfer microk8s-vm:/home/ubuntu/.kube/config config` to copy the kubeconfig to your host. If you have installed `kubectl`, and have also `git clone`d this repo into your home directory on the host, then you can run the commands from there rather than from a shell within the VM if you'd prefer.
Here are some other useful commands to manage that VM once it exists:
- `multipass stop microk8s-vm` - shut the VM down
- `multipass start microk8s-vm` - start it back up
- `multipass delete microk8s-vm && multipass purge` - delete and then purge the VM
Regardless of how you have people authenticate/log in to Kubernetes (AWS IAM Users/Roles for EKS, Google Accounts for GKE, OIDC to your identity provider, etc.), Kubernetes does its own authorization. It does this via its Role-Based Access Control (RBAC) APIs.
At a high level, the way Kubernetes RBAC works is that you either assign your Users a ClusterRole, which gives them cluster-wide privileges, or you assign them a Role which restricts them to only have access to a particular Namespace within the cluster. A Namespace is a logical boundary and grouping within Kubernetes to isolate the resources of particular teams from one another - so if you put different teams in different Namespaces via Roles the idea is that they can safely share the cluster as they shouldn't be able to interfere with one another. This is known as multi-tenancy. As you'll see, there is a little more to it than that, though...
For both a Role and a ClusterRole, you also assign the rules/permissions for what it can do. These are additive - there are no denies, only allows.
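To make that concrete, here is a minimal sketch of a Role and RoleBinding. The names and permissions here are illustrative only (they are not one of this repo's manifests) - they would give a user read-only access to Pods in a single Namespace:

```yaml
# Hypothetical example - grants user "jane" read-only access to Pods in team1
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: team1
rules:
  - apiGroups: [""]                  # "" is the core API group (Pods, Services, etc.)
    resources: ["pods"]
    verbs: ["get", "list", "watch"]  # additive - there is no way to write a deny
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: team1
subjects:
  - kind: User
    name: jane
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

A ClusterRole/ClusterRoleBinding pair looks nearly identical but has no `namespace` field, so the permissions apply cluster-wide.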
Let's explore how this all works:
- `kubectl get pods -A` - we are currently signed in as the cluster-admin ClusterRole, so we can do anything cluster-wide
- `kubectl api-resources` - this shows all the different resources whose use we can control in our RBAC
  - It also shows which of them are Namespaced (and so can be managed by Roles) vs. which aren't (and are therefore cluster-wide and need a ClusterRole to manage them)
  - And it also shows the short names for each resource type (which you can use to save typing in kubectl)
- `kubectl get clusterrole admin -o yaml | less` (press space to page down and q to exit) - this built-in admin role can explicitly do everything, so you can clone it and remove those things you don't want a user to be able to do. As you can see, the minute you don't use *'s there is quite a lot of YAML to go through!
- `kubectl get clusterrole admin -o yaml | wc -l` - 324 lines of it!
  - You can see the details about this and the other built-in roles, such as edit and view, here
- `cd ~/kubernetes-security-demos`
- `cat team1.yaml` - here we're creating a new Namespace, team1, and then creating the most basic and powerful Role possible, which can do anything within that Namespace, with *'s for apiGroups, resources and verbs. Then we're binding that new Role to a user named Jane.
  - This is perhaps overly permissive, as it:
    - Includes verbs like escalate and impersonate that most users won't need
    - Allows the Role to create other Roles and bind users to them within that Namespace
    - Allows the Role to create/edit all the NetworkPolicy firewalls for that Namespace and its workloads
    - Etc.
  - But, just by using a Role and a Namespace rather than a ClusterRole (which would be cluster-wide), we're still doing pretty well here.
- We have that team1 you saw, as well as another similar Namespace and Role called team2 that is bound to another user (John) - let's apply them!
- `kubectl apply -f team1.yaml && kubectl apply -f team2.yaml`
- `kubectl config get-contexts` - our two other users are already set up here in our kubectl: `jane`, who we just gave access to Namespace team1, and `john`, who we just gave access to Namespace team2
- `kubectl config use-context microk8s-jane` - we've just logged in as Jane instead
- `kubectl get pods -A` - if we try to see all the Pods in all of the Namespaces again, we now get an error that we can't use the cluster scope
- `kubectl get pods` - removing the -A for all Namespaces, it says we don't have any Pods in our team1 Namespace - which we do have access to see
- `cd ~/kubernetes-security-demos/demos/network-policy/hello-app`
- `kubectl apply -f .` - let's deploy an app to our Namespace - which we do have access to do
- `kubectl get pods` - as you can see, we do have enough cluster access to deploy workloads within our team1 Namespace
- `kubectl describe deployments hello-client-allowed` - note under Pod Template -> Environment that a Kubernetes Secret (hello-secret) is being mounted as the environment variable API_KEY at runtime
- `kubectl exec -it deploy/hello-client-allowed -n team1 -- /bin/sh`, then `whoami` (or whatever else you want to run), then `exit` - Jane even has permission to connect interactively into all of the Pods in her team1 Namespace and run whatever commands she wants at runtime
- `kubectl apply -f ../../../team1-noexec.yaml` - if we set our Role to those 324 explicit lines above, but with `pods/exec` commented out, that will block us from being able to do this
  - Note that we are still signed in as Jane - since we had used *'s, we actually have the rights to change our own Role's permissions in this way!
- `kubectl exec -it deploy/hello-client-allowed -n team1 -- /bin/sh` - trying it again, you'll see that it is blocked
- `kubectl config use-context microk8s-john` - now let's flip to John, who is restricted to the team2 Namespace
- `kubectl get pods` - we don't have any workloads deployed here yet
- `kubectl get pods --namespace=team1` - and, as expected, we are not allowed to interact with the one Jane deployed to team1
So, that was a very quick overview of how to configure multi-tenancy of Kubernetes at the control plane level via Namespaces and Roles - and of how much YAML it takes to move away from *'s for the resources and verbs in your Role definitions.
We are going to perform a variety of common container/Kubernetes exploits and then show how to block/defend against them as well as detect them if they happen in real-time with Falco.
Sysdig (the company that donated Falco to the CNCF - and that I work for) provides a general-purpose example exploit called Security Playground https://github.com/sysdiglabs/security-playground that is a Python app which just reads, writes and/or executes whatever paths you GET/POST against it. To understand a bit more about how that works, have a look at app.py.
The idea with this is to imagine there is another critical remote code execution (RCE) vulnerability that is not yet publicly known - so your vulnerability scans don't pick it up. What can you do to detect that it is being exploited - and prevent/mitigate any damage it would cause?
You can see various examples of how this works in the example-curls.sh file.
NOTE: This is deployed with a Service of type NodePort - if you'd prefer a load balancer, then modify that manifest to reconfigure the Service, and adjust the addresses in the bash script, as you'd prefer. Just be careful, as this is a very insecure app (by design) - don't put it on the Internet!
Run the following commands:
- `kubectl config get-contexts` - confirm we are still signed in as John (with access to Namespace team2)
- `cd ~/kubernetes-security-demos/demos/security-playground/`
- `kubectl apply -f security-playground.yaml` - deploy security-playground
- `kubectl get all` - we deployed a Deployment (which created a ReplicaSet, which created a Pod) as well as a NodePort Service exposing security-playground on port 30000 on our Node
- `cat ./example-curls.sh` - see all of the various commands we're going to run to exploit security-playground's remote code execution vulnerability
- `kubectl config use-context microk8s` - change our context to the cluster admin so the script can get our Node IP (which requires a ClusterRole)
- `./example-curls.sh` - run all of our example exploits
Watch the output scroll by to see this from the attacker's perspective.
In addition to our RCE vulnerability in the code, the security-playground.yaml example has three key security issues:
- It runs as root
- It is running with `hostPID: true`
- It is running in a privileged securityContext
When these (mis)configurations are done together, they allow you to escape out of the container isolation boundaries and be root on the host. This allows you not just full control over the host but also over/within the other containers.
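Roughly, the dangerous parts of such a PodSpec look like this (a sketch of the pattern, with an illustrative name - not the verbatim contents of security-playground.yaml):

```yaml
# Sketch of the three (mis)configurations combined - lab use only
apiVersion: v1
kind: Pod
metadata:
  name: insecure-example       # hypothetical name for illustration
spec:
  hostPID: true                # shares the host's PID namespace with the Pod
  containers:
    - name: app
      image: my-vulnerable-app # placeholder image
      securityContext:
        privileged: true       # disables most of the container's isolation
        runAsUser: 0           # runs the process as root
```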
We used two tools to break out and escalate our privileges:
- `nsenter`, which allows you to switch Linux namespaces (if you are allowed)
  - Not to be confused with Kubernetes Namespaces, Linux namespaces are a feature of the Linux kernel used by containers to isolate them from each other.
- `crictl`, which is used to control the local container runtime, containerd, bypassing Kubernetes (if you can connect to the container socket)
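To make the breakout concrete, here is a hedged sketch of the kind of Pod that combines hostPID and privileged so that nsenter can join the namespaces of the host's PID 1 (init). The Pod name is illustrative and this is not one of this repo's manifests:

```yaml
# Sketch only - a hostPID + privileged Pod you could use to "become" the host
apiVersion: v1
kind: Pod
metadata:
  name: breakout-example       # hypothetical name
spec:
  hostPID: true                # the container can see host processes, incl. PID 1
  containers:
    - name: shell
      image: ubuntu            # ubuntu images include nsenter (part of util-linux)
      securityContext:
        privileged: true       # needed to enter another process's namespaces
      command: ["sleep", "infinity"]
# Then, from inside the container (e.g. via kubectl exec), running:
#   nsenter --target 1 --mount --uts --ipc --net --pid
# joins PID 1's namespaces - i.e. a root shell on the Node itself.
```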
The security-playground-restricted.yaml example fixes all of these vulnerabilities in the following ways:
- We build a container image that runs as a non-root user (this required changes to the Dockerfile, as you'll see by comparing Dockerfile-unprivileged with Dockerfile).
- The PodSpec not only omits hostPID and the privileged securityContext, but the Namespace also gets the new Pod Security Admission (PSA) restricted mode, which ensures they can't be added back to the PodSpec to restore them.
- The restricted PSA also keeps us from specifying/restoring root permissions (the original image could only run as root; this new image could still be run as root if the PodSpec asked for it, but the PSA blocks that too).
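For contrast with the insecure spec, the restricted profile expects roughly these settings in the PodSpec (a sketch of the relevant fields only; the name and image tag are illustrative rather than copied from security-playground-restricted.yaml):

```yaml
# Sketch of the securityContext fields the restricted PSA profile requires
apiVersion: v1
kind: Pod
metadata:
  name: restricted-example                    # hypothetical name
spec:
  securityContext:
    runAsNonRoot: true                        # refuse to start the Pod as UID 0
    seccompProfile:
      type: RuntimeDefault                    # default syscall filtering
  containers:
    - name: app
      image: security-playground-unprivileged # hypothetical non-root image tag
      securityContext:
        allowPrivilegeEscalation: false       # no setuid/sudo-style escalation
        capabilities:
          drop: ["ALL"]                       # drop every Linux capability
```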
Run the following:
- `cd ~/kubernetes-security-demos/demos/security-playground`
- `cat security-playground-restricted.yaml`
- `kubectl apply -f security-playground-restricted.yaml`
- `./example-curls-restricted.sh`
Comparing the results - this blocked almost everything that worked before:
| # | security-playground | security-playground-restricted |
|---|---------------------|--------------------------------|
| 1 | allowed | blocked (by not running as root) |
| 2 | allowed | blocked (by not running as root) |
| 3 | allowed | blocked (by not running as root) |
| 4 | allowed | blocked (by not running as root and no hostPID and no privileged securityContext) |
| 5 | allowed | blocked (by not running as root and no hostPID and no privileged securityContext) |
| 6 | allowed | blocked (by not running as root and no hostPID and no privileged securityContext) |
| 7 | allowed | blocked (by not running as root and no hostPID and no privileged securityContext) |
| 8 | allowed | allowed |
So, even with still having this critical remote code execution vulnerability in our service, we still managed to block nearly everything through better configuration/posture for this workload!
And, on that last item, in a later section we are going to show how to block that too via NetworkPolicies - limiting the Internet egress needed to download the miner and/or connect to the mining pool.
There is now a feature built in to Kubernetes (which went GA in 1.25) to enforce standards around these insecure options in a PodSpec that undermine your workload/cluster security - Pod Security Admission.
This works by adding labels onto each Namespace. There are two standards that it can warn about and/or enforce for you - baseline and restricted.
- baseline - this prevents the worst of the parameters in the PodSpec, such as hostPID and privileged, but still allows the container to run as root
- restricted - this goes further and blocks all of the insecure options, including running as root
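The Namespace labels that drive this look like the following sketch (the label keys are the real PSA ones; applying both `warn` and `enforce` gives you a warning at apply time as well as a hard block):

```yaml
# PSA is configured purely via labels on the Namespace - e.g. to both warn
# about and enforce the restricted standard:
apiVersion: v1
kind: Namespace
metadata:
  name: security-playground-restricted
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
```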
We enabled that for the Namespace security-playground-restricted - let's see how that works:
Run:
- `kubectl describe namespace security-playground-restricted` - note that we are both warning about and enforcing the restricted standard here
- `kubectl apply -f security-playground.yaml -n security-playground-restricted` - see how our original insecure security-playground isn't allowed here by the PSA
Getting all of your workload namespaces to baseline if not restricted makes a big difference in the security posture of your cluster.
In addition to PSA, there are free open source tools like kube-bench that can scan your cluster's posture against things like the CIS Benchmarks - which cover not just those PodSpec options but many other aspects of cluster security.
To see this in action:
- `cd ~/kubernetes-security-demos/demos`
- `kubectl apply -f kubebench-job.yaml` - deploy a one-time Job to scan your cluster (you could change this Job's spec to run it regularly if you wanted)
- `kubectl logs job/kube-bench` - have a look at the results
  - Had the 5.2.2, 5.2.3 and 5.2.7 failures been fixed for security-playground, they would have prevented the attack. Fixing many of the other findings here would be a good idea too for an ideal security posture.
Finally, we've had a free opensource tool in our cluster all along here watching what we've been up to - Falco. Falco watches streams of data such as the Linux kernel syscalls on all your Nodes, as well as your Kubernetes audit trail, for suspicious behavior and can alert you to it in realtime. This is often referred to as "Runtime Threat Detection."
All of its events are aggregated by Falcosidekick, which can fan them out to any number of destinations such as your SIEM, your alerting systems like PagerDuty or your messaging tools like Slack.
It ships with a variety of sensible rules by default. You can find out more about those in this GitHub Repo - https://github.com/falcosecurity/rules
- Open the Falcosidekick UI by going to http://(Node IP):30282 and logging in with the username/password admin/admin
  - You can run `kubectl get nodes -o wide` to find the IP address to use (the INTERNAL-IP)
- Note the Rules that have been firing. Many of these things might not be issues, but it is good that Falco has recorded them so we can decide whether or not they are in our case.
- Go to the Events tab to see the Events in more detail.
  - First we'll search for `playground` in the search box under the Sources dropdown, then scroll to the bottom and increase the Rows per page to 50. Note the following events:
    - `Launch Privileged Container` and `Create Privileged Pod` - this is where security-playground was launched with the privileges we were able to exploit to escape the container
    - `Read sensitive file untrusted` - this is where we tried to read /etc/shadow
    - `Write below binary dir` - this happened whenever we wrote to /bin and /usr/bin (/bin/hello in the container and /usr/bin/crictl on the Node)
    - `Launch Package Management Process in Container` - this is where we `apt install`ed nmap
    - `Drop and execute new binary in container` - this happened whenever we added a new executable at runtime (one that wasn't in the image) and ran it (nmap, xmrig, the dpkg to discover our architecture)
    - `Launch Suspicious Network Tool in Container` - this is where we ran nmap to perform a network discovery scan
    - `The docker client is executed in a container` - this fires not just on the Docker CLI but also on other similar tools like crictl and kubectl
      - You can see all of the commands we ran as we were breaking out of the container, including our psql SELECT, in the cmdline of these events
    - `Launch Ingress Remote File Copy Tools in Container` - this is where we ran wget to get crictl and xmrig
  - (Optional) remove the search and/or go to the Dashboard tab to look around through the various other Events that Falco has caught during our session
While there are many tools available for this, Docker has scanning built in to its CLI (Docker Scout). Let's try using that one.
NOTE: This won't run within your microk8s VM and instead needs to run on a machine with Docker (Linux) or Docker Desktop (Windows or Mac) installed.
- Clone the repository on the machine running Docker if you haven't already - `git clone https://github.com/jasonumiker-sysdig/kubernetes-security-demos.git`
- Run `cd ~/kubernetes-security-demos/demos/security-playground/docker-build-security-playground` (assuming you cloned it to your home directory)
- Run `docker build -t security-playground:latest .`
- If you are running Docker Desktop then you should already have Scout; otherwise run `curl -sSfL https://raw.githubusercontent.com/docker/scout-cli/main/install.sh | sh -s --`
- Run `docker login` to log in to Docker if you are not already
- Run `docker scout cves security-playground:latest` - as you can see, there are many low-severity vulnerabilities
- Run `docker scout cves security-playground:latest --only-severity "critical, high"` to filter out anything that isn't a critical or a high - and now (as of this writing) I don't see any
In this case we haven't even pushed this image to a registry yet, and we are already able to see whether it has vulnerabilities/CVEs we need to fix before pushing and deploying it.
Now let's look at how NetworkPolicies work and how to isolate network traffic within our cluster(s) - as well as egress traffic to the Internet.
We had already deployed a workload in team1 that included a server Pod (hello-server) as well as two client Pods (hello-client-allowed and hello-client-blocked).
Out of the box all traffic is allowed which you can see as follows:
- `kubectl logs deployment/hello-client-allowed -n team1` - as you can see, it is getting a response from the server
- `kubectl logs deployment/hello-client-blocked -n team1` - our to-be-blocked Pod is not yet blocked and is getting a response from the server as well
- `cd ~/kubernetes-security-demos/demos/network-policy`
- `cat example-curl-networkpolicy.sh` - see an example curl that tries to hit hello-server (in Namespace team1) from security-playground (in Namespace team2)
- `./example-curl-networkpolicy.sh` - run that and see the response
There are two common ways to write NetworkPolicies to allow/deny traffic dynamically between workloads within the cluster - against labels and against namespaces.
- `cat network-policy-namespace.yaml` - as you can see, here we are allowing traffic from Pods within the Namespace team1 to Pods with the label app set to hello-server (implicitly also in the Namespace team1, where we are deploying the NetworkPolicy)
- `kubectl apply -f network-policy-namespace.yaml` - let's apply that NetworkPolicy
- `kubectl logs deployment/hello-client-allowed -n team1` and `kubectl logs deployment/hello-client-blocked -n team1` - both of our Pods in team1 can still reach the server
- `./example-curl-networkpolicy.sh` - but our security-playground in team2 can't any longer (it will time out)
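A namespace-based policy of the kind described above typically looks something like this (a sketch based on that description, not necessarily the verbatim file; the `kubernetes.io/metadata.name` label is set automatically on every Namespace):

```yaml
# Sketch: allow ingress to app=hello-server Pods only from Pods in Namespace team1
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: hello-server-allow       # hypothetical name
  namespace: team1
spec:
  podSelector:
    matchLabels:
      app: hello-server          # the Pods this policy protects
  ingress:
    - from:
        - namespaceSelector:     # any Pod in a Namespace named team1
            matchLabels:
              kubernetes.io/metadata.name: team1
```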
Now let's try it with labels - which is better for restricting traffic within a Namespace to least privilege:
- `cat network-policy-label.yaml` - as you can see, here we are allowing traffic from Pods with the label app set to hello to Pods with the label app set to hello-server
- `kubectl apply -f network-policy-label.yaml` - let's apply this NetworkPolicy (overwriting the last one, as they have the same name)
- `kubectl logs deployment/hello-client-blocked -n team1` - and now we'll see that our blocked Pod, whose app label is not set to hello, is being blocked by the NetworkPolicy
- `./example-curl-networkpolicy.sh` - our NetworkPolicy blocks this one too, both because it isn't in the same Namespace and because it doesn't have the correct label
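The label-based variant just swaps the namespaceSelector for a podSelector (again a sketch based on the description above, not necessarily the verbatim file):

```yaml
# Sketch: allow ingress to app=hello-server only from Pods labeled app=hello
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: hello-server-allow       # same name, so applying it overwrites the previous policy
  namespace: team1
spec:
  podSelector:
    matchLabels:
      app: hello-server
  ingress:
    - from:
        - podSelector:           # only Pods (in this Namespace) labeled app=hello
            matchLabels:
              app: hello
```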
NetworkPolicies don't just help us control Ingress traffic, though, they can also help us control egress traffic - including preventing access to the Internet.
- `cat network-policy-deny-egress.yaml` - this policy will deny all egress from all Pods in the Namespace it is deployed in
  - Any required egress traffic will need an explicit allow - either added to this policy or in another one applied to the same Namespace
- `kubectl apply -f network-policy-deny-egress.yaml -n security-playground-restricted` - apply this to the security-playground-restricted Namespace
- `kubectl delete --all pods --namespace=security-playground-restricted` - we had already downloaded xmrig to our running container, so start a fresh one to properly test the NetworkPolicy
- `./security-playground/example-curls-restricted.sh` - re-run our example curls against security-playground-restricted. We've now blocked the entire attack - even while still having that critical remote code execution vulnerability!
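A deny-all-egress policy of the kind described above is typically just an empty egress rule set (sketch with an illustrative name, not necessarily the verbatim file):

```yaml
# Sketch: select every Pod in the Namespace and allow no egress at all
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-egress        # hypothetical name
spec:
  podSelector: {}          # empty selector = every Pod in this Namespace
  policyTypes:
    - Egress
  # no egress rules listed = no egress traffic is allowed
```

Note that in practice a blanket egress deny also blocks DNS, so workloads that need to resolve names usually also get an explicit allow for UDP/TCP port 53 to the cluster's DNS service.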
That was a very basic introduction to NetworkPolicies. There are a number of other good/common examples on this site to explore the topic further - https://github.com/ahmetb/kubernetes-network-policy-recipes
There is also a great editor for NetworkPolicies at https://editor.networkpolicy.io/