Popeye is a utility that scans a live Kubernetes cluster and reports potential issues with deployed resources and configurations. It sanitizes your cluster based on what's deployed, not what's sitting on disk. By scanning your cluster, it detects misconfigurations and helps ensure best practices are in place, preventing potential future headaches. It aims to reduce the cognitive overload one faces when operating a Kubernetes cluster in the wild. Furthermore, if your cluster employs a metrics-server, it reports potential resource over/under allocations and attempts to warn you should your cluster run out of capacity.
Popeye is a read-only tool: it does not alter any of your Kubernetes resources in any way!
Popeye is available on Linux, OSX and Windows platforms.
- Binaries for Linux, Windows and Mac are available as tarballs on the release page or via the SnapCraft link above.
- For OSX/Unix using Homebrew/LinuxBrew:
  ```shell
  brew install derailed/popeye/popeye
  ```
- Building from source: Popeye was built using go 1.12+. In order to build Popeye from source you must:
  - Clone the repo.
  - Add the following directive to your go.mod file:
    ```text
    replace (
      github.com/derailed/popeye => MY_POPEYE_CLONED_GIT_REPO
    )
    ```
  - Build and run the executable:
    ```shell
    go run main.go
    ```
  Quick recipe for the impatient:
  ```shell
  # Clone outside of GOPATH
  git clone https://github.com/derailed/popeye
  cd popeye
  # Build and install
  go install
  # Run
  popeye
  ```
Popeye scans your cluster for best practices and potential issues. Currently, Popeye only looks at the resource kinds listed in the table below. More will come soon! We are hoping Kubernetes friends will pitch in to make Popeye even better.
The aim of the sanitizers is to pick up on misconfigurations, i.e. things like port mismatches, dead or unused resources, metrics utilization, probes, container images, RBAC rules, naked resources, etc.
Popeye is not another static analysis tool. It runs against live clusters, inspecting and sanitizing Kubernetes resources as they are in the wild!
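To make the idea of a live check concrete, here is a minimal Python sketch of one kind of check named above: a port mismatch between a service and the pods it selects. It is an illustration only, not Popeye's actual implementation (Popeye is written in Go). The function names and the sample `service` and `pod` dictionaries are assumptions made for this example; they mimic the shape of Kubernetes manifests.

```python
def selects(selector, labels):
    """True if every selector key/value pair is present in the pod labels."""
    return bool(selector) and all(labels.get(k) == v for k, v in selector.items())


def unmatched_target_ports(service, pods):
    """Return the service target ports that no selected pod exposes."""
    selector = service["spec"].get("selector", {})
    exposed = set()
    for pod in pods:
        if not selects(selector, pod["metadata"].get("labels", {})):
            continue
        for container in pod["spec"]["containers"]:
            for port in container.get("ports", []):
                # A targetPort may reference a container port by number or by name.
                exposed.add(port["containerPort"])
                if "name" in port:
                    exposed.add(port["name"])
    # In Kubernetes, targetPort defaults to the service port when unset.
    targets = [p.get("targetPort", p["port"]) for p in service["spec"].get("ports", [])]
    return [t for t in targets if t not in exposed]


# Hypothetical sample data: the service refers to a port named "http",
# but the selected pod names its port "web".
service = {"spec": {"selector": {"app": "web"},
                    "ports": [{"port": 80, "targetPort": "http"}]}}
pod = {"metadata": {"labels": {"app": "web"}},
       "spec": {"containers": [{"ports": [{"name": "web", "containerPort": 8080}]}]}}

print(unmatched_target_ports(service, [pod]))
```

A static linter looking at the service manifest alone could not catch this, because the mismatch only shows up once the service is compared with the pods actually running.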
Here is a list of sanitizers in place for the current release.
| Resource | Sanitizers | Section |
|---|---|---|
| Node | | no |
| | Conditions, i.e. not ready, out of mem/disk, network, pids, etc. | |
| | Pod tolerations referencing node taints | |
| | CPU/MEM utilization metrics, trips if over limits (default 80% CPU/MEM) | |
| Namespace | | ns |
| | Inactive | |
| | Dead namespaces | |
| Pod | | po |
| | Pod status | |
| | Containers statuses | |
| | ServiceAccount presence | |
| | CPU/MEM on containers over a set CPU/MEM limit (default 80% CPU/MEM) | |
| | Container image with no tags | |
| | Container image using latest tag | |
| | Resources request/limits presence | |
| | Probes liveness/readiness presence | |
| | Named ports and their references | |
| Service | | svc |
| | Endpoints presence | |
| | Matching pods labels | |
| | Named ports and their references | |
| ServiceAccount | | sa |
| | Unused, detects potentially unused SAs | |
| Secrets | | sec |
| | Unused, detects potentially unused secrets or associated keys | |
| ConfigMap | | cm |
| | Unused, detects potentially unused cm or associated keys | |
| Deployment | | dp |
| | Unused, pod template validation, resource utilization | |
| StatefulSet | | sts |
| | Unused, pod template validation, resource utilization | |
| PersistentVolume | | pv |
| | Unused, check volume bound or volume error | |
| PersistentVolumeClaim | | pvc |
| | Unused, check bound or volume mount error | |
| HorizontalPodAutoscaler | | hpa |
| | Unused, utilization, max burst checks | |
| PodDisruptionBudget | | pdb |
| | Unused, check minAvailable configuration | |
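As an illustration of the two container image checks in the table (image with no tag, image using the latest tag), here is a minimal Python sketch. It is not Popeye's code; the function names are hypothetical, and treating a digest-pinned image as fine is an assumption of this example rather than documented Popeye behavior.

```python
def image_tag(image):
    """Return the tag of a container image reference, or None if it has none."""
    name = image.split("@", 1)[0]      # drop a digest such as @sha256:...
    last = name.rsplit("/", 1)[-1]     # only look at the last path component,
                                       # so a registry port (host:5000/...) is not taken for a tag
    if ":" in last:
        return last.rsplit(":", 1)[1]
    return None


def lint_image(image):
    """Classify an image reference as 'untagged', 'latest' or 'ok'."""
    if "@" in image:
        return "ok"                    # assumption: a digest pins the image
    tag = image_tag(image)
    if tag is None:
        return "untagged"
    if tag == "latest":
        return "latest"
    return "ok"


for ref in ["nginx", "nginx:latest", "localhost:5000/nginx", "quay.io/derailed/popeye:v0.3.6"]:
    print(ref, "->", lint_image(ref))
```

Both findings matter for the same reason: without an explicit, fixed tag, the image a pod actually runs can change between restarts.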
You can use Popeye standalone or with a spinach YAML config file to tune the sanitizers. Details about the Popeye configuration file are below.
```shell
# Dump version info
popeye version
# Popeye a cluster using your current kubeconfig environment.
popeye
# Popeye uses a spinach config file of course! aka spinachyaml!
popeye -f spinach.yml
# Popeye a cluster using a kubeconfig context.
popeye --context olive
# Stuck?
popeye help
```
Alternatively, Popeye can be run directly on your Kubernetes clusters as a single shot or cronjob.
Here is a sample setup; please modify it per your needs/wants. The manifests for this are in the k8s directory in this repo.
```shell
kubectl apply -f k8s/popeye/ns.yml && kubectl apply -f k8s/popeye
```
```yaml
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: popeye
  namespace: popeye
spec:
  schedule: "0 */1 * * *" # Fire off Popeye once an hour, at minute 0
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: popeye
          restartPolicy: Never
          containers:
            - name: popeye
              image: quay.io/derailed/popeye:v0.3.6
              imagePullPolicy: IfNotPresent
              args:
                - -o
                - yaml
              resources:
                limits:
                  cpu: 500m
                  memory: 100Mi
```
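The first field of a cron schedule is the minute, so it decides how often a job fires within each matching hour. The Python sketch below counts how many times a schedule fires in one day. It is a deliberately small illustration: it only understands `*`, `*/n` and plain numbers in the minute and hour fields and ignores the day and month fields, which is an assumption made to keep it short (real cron syntax is richer).

```python
def field_matches(field, value):
    """Match one cron field against a value; supports '*', '*/n' and a plain number."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return value == int(field)


def fires_per_day(schedule):
    """Count the minutes in a day on which the minute and hour fields both match."""
    minute, hour, *_ = schedule.split()
    return sum(
        1
        for h in range(24)
        for m in range(60)
        if field_matches(hour, h) and field_matches(minute, m)
    )


print(fires_per_day("0 */1 * * *"))  # minute 0 of every hour
print(fires_per_day("* */1 * * *"))  # every minute of every hour
```

A schedule with `0` in the minute field runs 24 times a day, whereas a `*` in the minute field runs 1440 times a day, so the minute field is the one to check when a job should run once an hour.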
In order for Popeye to do his work, the signed-in user must have enough RBAC oomph to get/list the resources mentioned above.
Sample Popeye RBAC Rules (Subject to change!!)
```yaml
---
# Popeye ServiceAccount.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: popeye
  namespace: popeye
---
# Popeye needs get/list access on the following Kubernetes resources.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: popeye
rules:
  # Resources in the core API group.
  - apiGroups: [""]
    resources:
      - configmaps
      - endpoints
      - namespaces
      - nodes
      - persistentvolumes
      - persistentvolumeclaims
      - pods
      - secrets
      - serviceaccounts
      - services
    verbs: ["get", "list"]
  # Deployments and StatefulSets live in the apps group, not the core group.
  - apiGroups: ["apps"]
    resources:
      - deployments
      - statefulsets
    verbs: ["get", "list"]
  # HorizontalPodAutoscalers live in the autoscaling group.
  - apiGroups: ["autoscaling"]
    resources:
      - horizontalpodautoscalers
    verbs: ["get", "list"]
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources:
      - clusterroles
      - clusterrolebindings
      - roles
      - rolebindings
    verbs: ["get", "list"]
  - apiGroups: ["metrics.k8s.io"]
    resources:
      - pods
      - nodes
    verbs: ["get", "list"]
---
# Binds Popeye to this ClusterRole.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: popeye
subjects:
  - kind: ServiceAccount
    name: popeye
    namespace: popeye
roleRef:
  kind: ClusterRole
  name: popeye
  apiGroup: rbac.authorization.k8s.io
```
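One way to check that the service account really has the get/list access described above is `kubectl auth can-i` with impersonation. The Python sketch below only builds the command lines; it does not contact a cluster. The helper name and the short resource list are assumptions for illustration, and the account defaults simply mirror the `popeye` ServiceAccount in the `popeye` namespace from the sample.

```python
# A few of the resources from the sample ClusterRole; extend as needed.
RESOURCES = ["configmaps", "namespaces", "nodes", "pods", "secrets", "services"]


def can_i_commands(resources, verbs=("get", "list"), namespace="popeye", account="popeye"):
    """Build one `kubectl auth can-i` command per resource/verb pair."""
    subject = f"system:serviceaccount:{namespace}:{account}"
    return [
        ["kubectl", "auth", "can-i", verb, resource, "--as", subject, "--all-namespaces"]
        for resource in resources
        for verb in verbs
    ]


for cmd in can_i_commands(RESOURCES):
    print(" ".join(cmd))
```

Each printed command can be run by hand, or passed to `subprocess.run` against a cluster where your own user is allowed to impersonate service accounts; `kubectl` answers `yes` or `no` for each one.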
NOTE: This file will change as Popeye matures!
As of this release the spinach.yml format has changed slightly. There is now a new `excludes` section that allows one to exclude any Kubernetes resources from the sanitizer run. A resource is identified by a resource kind and a fully qualified resource name (FQN), i.e. `namespace/resource_name`. For example, the FQN of a pod named fred-1234 in namespace blee is `blee/fred-1234`. This provides for differentiating `fred/p1` and `blee/p1`. For cluster-wide resources, `FQN=name`. Exclude rules can be either a straight string match or a regular expression. In the latter case the regular expression must be indicated using the `rx:` prefix.
NOTE! Please tread carefully here with your regex, as a loose rule may exclude more resources than expected from the report. As your cluster resources change, this could render the sanitization sub-optimal. Once in a while it might be a good idea to run Popeye config-less to make sure you are trapping any new issues with your clusters...
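The exclusion rules described above can be pictured with a small Python sketch. It is not Popeye's implementation: the function names are hypothetical, and it assumes that an `rx:` rule matches anywhere in the FQN (an unanchored search) while a plain rule must equal the FQN exactly. Python's regex dialect also differs in places from the one Go uses.

```python
import re

RX_PREFIX = "rx:"


def fqn(name, namespace=None):
    """Fully qualified name: namespace/name, or just name for cluster-wide resources."""
    return f"{namespace}/{name}" if namespace else name


def is_excluded(resource_fqn, rules):
    """True if any rule matches: exact string match, or regex search for rx: rules."""
    for rule in rules:
        if rule.startswith(RX_PREFIX):
            if re.search(rule[len(RX_PREFIX):], resource_fqn):
                return True
        elif rule == resource_fqn:
            return True
    return False


print(fqn("fred-1234", "blee"))                                # blee/fred-1234
print(is_excluded("default/nginx-1", ["rx:nginx"]))            # intended match
print(is_excluded("prod/my-nginx-cache", ["rx:nginx"]))        # also matches: the rule is loose
print(is_excluded("prod/my-nginx-cache", ["rx:^default/nginx"]))  # anchored rule does not
```

Under these assumptions, anchoring the expression and including the namespace, as in the last call, is one way to keep a rule from silently swallowing resources added later.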
Here is an example spinach file as it stands in this release:
```yaml
# A Popeye sample configuration file
popeye:
  # Checks resources against reported metrics usage.
  # If over/under these thresholds a sanitization warning will be issued.
  # Your cluster must run a metrics-server for these to take place!
  allocations:
    cpu:
      underPercUtilization: 200 # Checks if cpu is under allocated by more than 200% at current load.
      overPercUtilization: 50 # Checks if cpu is over allocated by more than 50% at current load.
    memory:
      underPercUtilization: 200 # Checks if mem is under allocated by more than 200% at current load.
      overPercUtilization: 50 # Checks if mem is over allocated by more than 50% at current load.
  # Excludes section provides for excluding certain resources scanned by Popeye.
  excludes:
    # Exclude any configmaps within namespace fred that end with a version#
    configmap:
      - rx:fred/.*\.v\d+
    # Exclude kube-public + any namespace that starts with either kube or istio
    namespace:
      - kube-public
      - rx:kube
      - rx:istio
    # Exclude node named n1 from the scan.
    node:
      - n1
    # Exclude any pods that start with nginx or contain -telemetry
    pod:
      - rx:nginx
      - rx:.*-telemetry
    # Exclude any service containing -dash in their name.
    service:
      - rx:.*-dash
  # Configure node resources.
  node:
    # Limits set a cpu/mem threshold in % ie if cpu|mem > limit a lint warning is triggered.
    limits:
      # CPU checks if current CPU utilization on a node is greater than 90%.
      cpu: 90
      # Memory checks if current Memory utilization on a node is greater than 80%.
      memory: 80
  # Configure pod resources
  pod:
    # Restarts check the restarts count and triggers a lint warning if above threshold.
    restarts: 3
    # Check container resource utilization in percent.
    # Issues a lint warning if above these thresholds.
    limits:
      cpu: 80
      memory: 75
```
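To show how thresholds like these might be applied, here is a minimal Python sketch. It is an interpretation, not Popeye's code. It assumes that a limit check compares current usage against capacity as a percentage, and that the allocation check compares current usage against the requested amount: usage above `underPercUtilization` percent of the request counts as under-allocated, and usage below `overPercUtilization` percent of the request counts as over-allocated. All function names are hypothetical, and values are in arbitrary but consistent units (for example millicores).

```python
def percent(used, total):
    """Usage as a percentage of total; 0 when there is nothing to compare against."""
    return 0.0 if total == 0 else used / total * 100


def over_limit(used, capacity, limit_percent):
    """True if utilization exceeds the configured limit, e.g. node cpu: 90."""
    return percent(used, capacity) > limit_percent


def allocation_status(current, requested, under_perc=200, over_perc=50):
    """Classify a request against current usage (assumed semantics, see lead-in)."""
    ratio = percent(current, requested)
    if ratio > under_perc:
        return "under-allocated"   # using far more than was requested
    if 0 < ratio < over_perc:
        return "over-allocated"    # requested far more than is being used
    return "ok"


print(over_limit(3700, 4000, 90))        # node at 92.5% CPU against a 90% limit
print(allocation_status(500, 200))       # usage is 250% of the request
print(allocation_status(40, 200))        # usage is 20% of the request
```

With the sample values, a node using 3700 of 4000 millicores trips the 90% CPU limit, while a container using 150 millicores against a 200 millicore request sits between both allocation thresholds and raises nothing.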
The sanitizer report outputs each resource group scanned and their potential issues. The report is color/emoji coded in terms of sanitizer severity levels:
Level | Icon | Jurassic | Color | Description |
---|---|---|---|---|
Ok | ✅ | OK | Green | Happy! |
Info | 🔊 | I | BlueGreen | FYI |
Warn | 😱 | W | Yellow | Potential Issue |
Error | 💥 | E | Red | Action required |
The heading section for each Kubernetes resource scanned provides an issue rollup summary count for each of the categories above.
The Summary section provides a Popeye Score based on the sanitization pass on the given cluster.
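The per-resource rollup described above can be pictured as a simple tally of findings by severity level. The Python sketch below is illustrative only; the `Level` enum, the `rollup` function and the sample findings are assumptions for this example. How the Popeye Score itself is computed is not described here, so the sketch does not attempt to reproduce it.

```python
from collections import Counter
from enum import IntEnum


class Level(IntEnum):
    """Severity levels, ordered from least to most severe."""
    OK = 0
    INFO = 1
    WARN = 2
    ERROR = 3


def rollup(findings):
    """Count findings per severity level; findings are (resource, Level) pairs."""
    counts = Counter(level for _, level in findings)
    return {level.name: counts.get(level, 0) for level in Level}


# Hypothetical findings for the pod section of a report.
findings = [
    ("default/web-1", Level.OK),
    ("default/web-2", Level.WARN),
    ("default/db-1", Level.WARN),
    ("kube-system/dns-1", Level.ERROR),
]

print(rollup(findings))
print(max(level for _, level in findings).name)  # worst level seen in this section
```

Because the levels are ordered, the worst finding in a section is just the maximum, which is handy for deciding whether a section needs attention at all.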
This initial drop is brittle. Popeye will most likely blow up if:
- You're running older versions of Kubernetes. Popeye works best with Kubernetes 1.13+.
- You don't have enough RBAC fu to manage your cluster (see RBAC section).
This is a work in progress! If there is enough interest in the Kubernetes community, we will enhance per your recommendations/contributions. Also, if you dig this effort, please let us know that too!
Popeye sits on top of many open source projects and libraries. Our sincere appreciation to all the OSS contributors that work nights and weekends to make this project a reality!
- Email: fernand@imhotep.io
- Twitter: @kitesurfer
© 2019 Imhotep Software LLC. All materials licensed under Apache v2.0