alemansec / kubernetes-arm-sandbox

playing with kubernetes on 4x Raspberry pi 3B machines


About

Personal "iterative" kubernetes sandbox used while playing with kubernetes at home on top of raspberry pi 3B machines.

We'll first manually set up a single-master kubernetes cluster on top of 4x Raspberry Pi 3B machines (our fifth Raspberry Pi's ethernet device died recently).

Because one Raspberry Pi 3B can barely handle the load generated by 4 worker nodes' calls to the apiserver (no comment), we'll later switch to a multi-master cluster, composed of :

  • 3 master nodes
  • 1 worker node (later on we'll add amd64 nodes to the cluster, re-entering the painful world of docker and multi-arch images :p)
  • 2x HAproxy + VRRP to provide HA for control plane nodes

references :

iterations

  • step 1 : basic manual cluster setup, in single-master mode with a nginx ingress controller + metalLB and a default nginx service in spec.type: LoadBalancer mode as a proof of concept.

  • step 2 : using ansible to create the same cluster automatically.

    • installing docker-ce on arm+amd64 machines,
    • configuring networking/VLANs on cluster members
    • installing a local docker registry on ARM machines, reconfiguring the docker engines to use this registry as a registry-mirror, with self-signed ssl/tls certs (see the sketch after this list). This registry will not run on top of the cluster (chicken and egg problem) but on a single ARM device instead.
  • step 3 : setting up custom namespaces, network policies, roles (rbac) and better test service using homemade docker images

  • step 4 : migrate our manually-created test service to our local gitlab-ce installation, with gitlab-ci integration. (or better, switch to jenkins-x )
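
As a sketch of the registry-mirror part of step 2 (the registry host name and port below are assumptions ; the real values, plus the ssl/tls trust setup, are handled by the ansible roles), each docker engine would get something like :

  # hypothetical mirror endpoint - adjust to your local registry
  # warning: this overwrites any existing /etc/docker/daemon.json
  echo '{ "registry-mirrors": ["https://registry.p13.p.s18m2.com:5000"] }' > /etc/docker/daemon.json
  systemctl restart docker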

Step 1

Hosts

In the first iteration/step, we'll use the following Raspberry Pi hosts :

  • pi01.p13.p.s18m2.com 10.13.1.21
  • pi02.p13.p.s18m2.com 10.13.1.22
  • pi03.p13.p.s18m2.com 10.13.1.23
  • pi04.p13.p.s18m2.com 10.13.1.24 (master node)

One single kubernetes master node for now.

Because I want to bring both ARM and amd64 machines into the mix, I use ansible to provision 4 extra amd64 virtual machines using :

This adds the following to the mix :

  • vk8s01.p13.p.s18m2.com
  • vk8s02.p13.p.s18m2.com
  • vk8s03.p13.p.s18m2.com
  • vk8s04.p13.p.s18m2.com

hosts preparation

Note : I won't add this ansible part to the repository, for multiple reasons (those initial playbooks and roles are quite ugly and still contain many hardcoded values, plus ansible will probably throw dozens of deprecation warnings and then errors in a few weeks, as usual...)

configure local nameservers and networking

We set up two local nameservers to provide host name resolution for our sandbox zone "p13.p.s18m2.com". Extra VLAN configuration is also applied by this playbook, so our Raspberry Pi hosts can sit on various networks the way we like :

  ansible-playbook -i inventory/bootstrap/ _bootstrap.yml

  # limit to bind :
  ansible-playbook -i inventory/bootstrap/ _bootstrap.yml --tags bind

optional - purge previous docker remains

docker system prune --volumes
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
iptables -t nat -F
iptables -t mangle -F
iptables -F
iptables -X

ip6tables -P INPUT ACCEPT
ip6tables -P FORWARD ACCEPT
ip6tables -P OUTPUT ACCEPT
ip6tables -t nat -F
ip6tables -t mangle -F
ip6tables -F
ip6tables -X

install docker-ce

Basic docker-ce setup on amd64 + armhf (armv7 only for now, not using aarch64 here yet), plus the ssl/tls certificate setup related to our local docker registry (also running on ARM devices) :

  cd ansible/
  ansible-playbook -i inventory/sandbox/ book_docker_engine.yml

disable swap

  dphys-swapfile swapoff
  dphys-swapfile uninstall
  update-rc.d dphys-swapfile remove

cgroups

  # append to /boot/cmdline.txt :
  cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory swapaccount=1
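
One way to do that append non-interactively (a sketch, assuming the flags are not already present ; /boot/cmdline.txt must stay a single line, and a reboot is needed afterwards) :

  # append the cgroup flags to the (single) kernel command line, then reboot
  sed -i '1 s/$/ cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory swapaccount=1/' /boot/cmdline.txt
  reboot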

Kubernetes setup using kubeadm

install kubeadm

  curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
  # yes, this shows "xenial" although we're running raspbian on our raspberry pi machines:
  echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
  apt-get update && apt-get -y install kubeadm

kubernetes root dir

we'll use a separate partition (hopefully faster than internal mmc card) for the whole set of kubernetes tools.

we'll use /opt/hosting/infra/k8s ; /opt/hosting/infra being already mounted locally

  # on all nodes :
  mkdir /opt/hosting/infra/k8s
  sed -i 's#KUBELET_EXTRA_ARGS=#KUBELET_EXTRA_ARGS="--root-dir=/opt/hosting/infra/k8s"#g' /etc/default/kubelet
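
If the kubelet service is already running at this point, it presumably needs a restart to pick up the new --root-dir flag (before kubeadm init, a crash-looping kubelet is expected anyway) :

  systemctl daemon-reload
  systemctl restart kubelet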

(our docker-engine daemons also use a separate partition on a device other than the internal mmc card).

master node init

on future master node (pi04.p13.p.s18m2.com in our case), as root :

  # pre-pull required docker images to run a k8s master node :
  kubeadm config images pull

We will use 'flannel' networking, so we have to run (10.244.0.0/16 being the default flannel cidr) :

  # as root, on pi04 (future master node) :
  # 10.13.1.24 = pi04, our future master node (where we run those commands) :
  kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.13.1.24

or better : use the following script to run kubeadm init in the first place (chicken and egg problem on slower machines : kubeadm init creates the manifests containing... the timeout values we need to increase) :

#!/usr/bin/env python
#
# replace some startup timeouts when manifests files are created by kubeadm init
# see https://github.com/kubernetes/kubeadm/issues/413
# run me using 'sudo python foobar.py'
#
import os
import time
import threading

filepath = '/etc/kubernetes/manifests/kube-apiserver.yaml'

def replace_defaults():
    print('Thread start looking for the file')
    while not os.path.isfile(filepath):
        time.sleep(1) #wait one second
    print('\033[94m -----------> FILE FOUND: replacing defaults \033[0m')
    os.system("""sed -i 's/failureThreshold: [0-9]/failureThreshold: 18/g' /etc/kubernetes/manifests/kube-apiserver.yaml""")
    os.system("""sed -i 's/timeoutSeconds: [0-9][0-9]/timeoutSeconds: 20/g' /etc/kubernetes/manifests/kube-apiserver.yaml""")
    os.system("""sed -i 's/initialDelaySeconds: [0-9][0-9]/initialDelaySeconds: 240/g' /etc/kubernetes/manifests/kube-apiserver.yaml""")

t = threading.Thread(target=replace_defaults)
t.start()
os.system("kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.13.1.24")

as a regular user on the same node (future master node, pi04), as instructed by the kubeadm init output :

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

write the join command down somewhere ; here's a sample :

kubeadm join 10.13.1.24:6443 --token ymduph.d5tuo85q088e1k72 --discovery-token-ca-cert-hash sha256:e568747339f7deaf29b6777d9bffd52766bf5fee22e0b94394bf4f49b9dcdb03

on master node (pi04), as the regular user with .kube/config (or on your desktop machine if you copied .kube/config under ~/.kube/) :

  #kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
  # sed -i 's/amd64/arm/g' ./kube-flannel.yml
  kubectl apply -f ./kube-flannel.yml
  kubectl get pods --namespace=kube-system

on each node, including master node :

  #sudo sysctl net.bridge.bridge-nf-call-iptables=1
  # note: a plain 'sudo echo ... > file' would fail, the redirection is not executed as root
  echo "net.bridge.bridge-nf-call-iptables = 1" | sudo tee /etc/sysctl.d/k8s.conf
  sudo sysctl -p /etc/sysctl.d/k8s.conf

as root, on every other node :

  # run "kubeadm token create --print-join-command" as root on master node to get a new token if lost :
  kubeadm join 10.13.1.24:6443 --token ymduph.d5tuo85q088e1k72 --discovery-token-ca-cert-hash sha256:e568747339f7deaf29b6777d9bffd52766bf5fee22e0b94394bf4f49b9dcdb03

next - basic application + service with LoadBalancer type on baremetal (arm)

exposing the service onto an external ip address :

To be able to use the spec.type: LoadBalancer, we install an ingress controller (nginx in our case).

We also setup metallb, so we get dynamically allocated 'external' ip addresses from configured pool.

https://kubernetes.github.io/ingress-nginx/deploy/

MetalLB

The MetalLB controller needs to be installed, so our exposed services (with type LoadBalancer) get external IP addresses:

  # https://metallb.universe.tf/installation/
  #kubectl apply -f https://raw.githubusercontent.com/google/metallb/v0.7.3/manifests/metallb.yaml
  kubectl apply -f ./ingress/metallb/metallb.yaml
  # MetalLB’s components will still start, but will remain idle until you define and deploy a configmap :
  kubectl apply -f ./ingress/metallb/layer2-config.yaml
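
The content of ./ingress/metallb/layer2-config.yaml is not reproduced here, but a layer2 configmap for MetalLB v0.7.x looks roughly like this (the pool name matches the one referenced further down ; the address range is an assumption based on the 192.168.100.x addresses handed out later) :

  apiVersion: v1
  kind: ConfigMap
  metadata:
    namespace: metallb-system
    name: config
  data:
    config: |
      address-pools:
      - name: production-public-ips
        protocol: layer2
        addresses:
        - 192.168.100.240-192.168.100.250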

Optional - requesting a specific IP address

https://metallb.universe.tf/usage/#requesting-specific-ips

"MetalLB respects the spec.loadBalancerIP parameter, so if you want your service to be set up with a specific address, you can request it by setting that parameter."

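A hedged example of such a service (the requested address is an assumption ; it must belong to a pool configured in layer2-config.yaml) :

  apiVersion: v1
  kind: Service
  metadata:
    name: my-nginx
  spec:
    ports:
    - port: 80
      targetPort: 80
    selector:
      run: my-nginx
    type: LoadBalancer
    loadBalancerIP: 192.168.100.240
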
Optional - requesting a specific address pool

MetalLB also supports requesting a specific address pool, if you want a certain kind of address but don’t care which one exactly. To request assignment from a specific pool, add the metallb.universe.tf/address-pool annotation to your service, with the name of the address pool as the annotation value. For example:

apiVersion: v1
kind: Service
metadata:
  name: nginx
  annotations:
    metallb.universe.tf/address-pool: production-public-ips
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx
  type: LoadBalancer

(the 'production-public-ips' pool being defined in ingress/metallb/layer2-config.yaml)

traefik ingress controller

Because ingress-nginx does not work on the ARM arch at the moment (see below), I'll switch to traefik.

  # RBAC ClusterRoleBinding:
  kubectl apply -f ingress/traefik/traefik-rbac.yaml

  # deployment :
  kubectl apply -f ingress/traefik/traefik-deployment.yaml

  # http basic auth for traefik dashboard access:
  # (the secret has to be created in the same namespace as the ingress object)
  htpasswd -c ./traefik-admin-auth someusername
  kubectl create secret generic traefik-dashboard-basic-auth --from-file ./traefik-admin-auth --namespace kube-system

  # service and ingress to expose traefik web ui :
  # openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout traefik.p13.p.s18m2.com.key -out traefik.p13.p.s18m2.com.crt -subj "/CN=traefik.p13.p.s18m2.com"
  # kubectl -n kube-system create secret tls traefik-ui-tls-cert --key=traefik.p13.p.s18m2.com.key --cert=traefik.p13.p.s18m2.com.crt
  kubectl -n kube-system create secret tls traefik-ui-tls-cert --key=./certificates/certs/traefik.p13.p.s18m2.com.key --cert=./certificates/certs/traefik.p13.p.s18m2.com.crt
  kubectl apply -f ingress/traefik/ui.yaml
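
The content of ingress/traefik/ui.yaml is not reproduced here ; a rough sketch of what such an ingress could look like with traefik 1.x (the 'traefik-web-ui' service name and the host are assumptions ; the tls and basic-auth secrets are the ones created above) :

  # hypothetical ingress for the traefik dashboard - names are assumptions
  apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    name: traefik-web-ui
    namespace: kube-system
    annotations:
      kubernetes.io/ingress.class: traefik
      ingress.kubernetes.io/auth-type: basic
      ingress.kubernetes.io/auth-secret: traefik-dashboard-basic-auth
  spec:
    tls:
    - hosts:
      - traefik.p13.p.s18m2.com
      secretName: traefik-ui-tls-cert
    rules:
    - host: traefik.p13.p.s18m2.com
      http:
        paths:
        - backend:
            serviceName: traefik-web-ui
            servicePort: 80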

nginx ingress controller

https://github.com/nginxinc/kubernetes-ingress/blob/master/docs/installation.md

Note that we had to use another image, because of our ARM arch, hence the use of ./ingress/nginx/mandatory.yaml

# amd64 only: kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/mandatory.yaml
# arm:
kubectl apply -f ./ingress/nginx/mandatory.yaml
# to see progress :
kubectl describe pods -n ingress-nginx

start our nginx ingress controller in loadBalancer mode so it gets an external ip :

  kubectl create -f ingress/nginx/service-loadbalancer.yaml

The nginx ingress controller gets assigned an IP address provided by MetalLB.

Later on, Services (no longer requiring spec.type: LoadBalancer) are automatically made available externally by the nginx ingress controller via kind: Ingress rules.
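
For example, a minimal Ingress rule for the 'my-nginx' test service used below could look like this (the host name is an assumption) :

  apiVersion: extensions/v1beta1
  kind: Ingress
  metadata:
    name: my-nginx
  spec:
    rules:
    - host: nginx.p13.p.s18m2.com
      http:
        paths:
        - path: /
          backend:
            serviceName: my-nginx
            servicePort: 80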

we can still expose services directly without Ingress rules, by using spec.type: LoadBalancer for those services.

Running into this issue : kubernetes/ingress-nginx#3545 (crashes after one or two hits)

I0205 07:58:13.441048       7 controller.go:195] Backend successfully reloaded.
I0205 07:58:13.451748       7 controller.go:212] Dynamic reconfiguration succeeded.
192.168.100.4 - [192.168.100.4] - - [05/Feb/2019:07:58:18 +0000] "GET / HTTP/1.1" 302 31 "-" "curl/7.62.0" 83 0.007 [default-gogs-ui-3000] 10.244.3.24:3000 31 0.010 302 8597a5ef8706a4416d94bb36e59f035f
192.168.100.4 - [192.168.100.4] - - [05/Feb/2019:07:58:19 +0000] "GET / HTTP/1.1" 302 31 "-" "curl/7.62.0" 83 0.015 [default-gogs-ui-3000] 10.244.3.24:3000 31 0.010 302 9fe59e2fc2e523e5fcfaea8326552cac
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x11a70]

goroutine 256 [running]:
runtime/internal/atomic.goXadd64(0x28e44bc, 0x2, 0x0, 0x3126e979, 0x3f7cac08)
    /usr/local/go/src/runtime/internal/atomic/atomic_arm.go:96 +0x1c
k8s.io/ingress-nginx/vendor/github.com/prometheus/client_golang/prometheus.(*histogram).Observe(0x28e4460, 0x3126e979, 0x3f7cac08)
    /go/src/k8s.io/ingress-nginx/vendor/github.com/prometheus/client_golang/prometheus/histogram.go:272 +0x68
k8s.io/ingress-nginx/internal/ingress/metric/collectors.(*SocketCollector).handleMessage(0x2be1780, 0x2fe8000, 0x14a, 0x600)
    /go/src/k8s.io/ingress-nginx/internal/ingress/metric/collectors/socket.go:269 +0xb8c

test service

nginx hello world with spec.type: LoadBalancer (dynamically getting an external IP address + load balancing)

  1. create nginx deployment :

This creates an nginx pod, only accessible from within the cluster (using the 'nginx:stable' image, which works properly on the arm arch).

  kubectl create -f ./use/deployments/helloworld-nginx.yml
  kubectl get pods -o wide
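
The manifest itself is not reproduced here, but ./use/deployments/helloworld-nginx.yml presumably looks roughly like the following (the 2 replicas and the 'run=my-nginx' label match the kubectl output shown later) :

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: my-nginx
  spec:
    replicas: 2
    selector:
      matchLabels:
        run: my-nginx
    template:
      metadata:
        labels:
          run: my-nginx
      spec:
        containers:
        - name: my-nginx
          image: nginx:stable
          ports:
          - containerPort: 80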

  2. exposing service to whole cluster :

we create a service, exposing the nginx application within the entire cluster.

either :

  kubectl expose deployment/my-nginx

or

  kubectl create -f ./use/services/helloworld-svc-nginx.yml
  kubectl get svc -o wide my-nginx
  kubectl describe svc my-nginx

  kubectl scale --current-replicas=2 --replicas=1 deployment/my-nginx

See if metallb did what it was supposed to do :

$ kubectl get svc -o wide
NAME         TYPE           CLUSTER-IP      EXTERNAL-IP       PORT(S)        AGE    SELECTOR
kubernetes   ClusterIP      10.96.0.1       <none>            443/TCP        6h7m   <none>
my-nginx     LoadBalancer   10.101.142.20   192.168.100.240   80:30020/TCP   16m    run=my-nginx

We got an 'EXTERNAL-IP' assigned, in the range configured within metallb config :p

me@machine_external_to_cluster$ curl -v http://192.168.100.240/
*   Trying 192.168.100.240...
* TCP_NODELAY set
* Connected to 192.168.100.240 (192.168.100.240) port 80 (#0)
> GET / HTTP/1.1
> Host: 192.168.100.240
> User-Agent: curl/7.62.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.14.2
< Date: Thu, 31 Jan 2019 10:56:08 GMT
< Content-Type: text/html
< Content-Length: 612
< Last-Modified: Tue, 04 Dec 2018 14:44:49 GMT
< Connection: keep-alive
< ETag: "5c0692e1-264"
< Accept-Ranges: bytes
<
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
...

(I can't use/validate ingress-nginx right now, because it crashes on arm in its 0.20 version, the latest available atm.)

documentations / links

dashboard

references/doc :

  kubectl apply -f ./dashboard/kubernetes-dashboard.yaml
  # then, to access the dashboard :
  # (for example, I'm executing 'kubectl proxy' on my desktop machine, previously configured to access the cluster using kubectl)
  kubectl proxy

  # create ServiceAccount :
  kubectl create -f ./dashboard/serviceaccount.yml

  # clusterrole binding:
  kubectl apply -f ./dashboard/clusterrole_binding.yml

  # retrieve access token using :
  kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')

  # visit http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
  # paste the 'token' value of previous command to login
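
For reference, ./dashboard/serviceaccount.yml and ./dashboard/clusterrole_binding.yml are presumably close to the usual 'admin-user' recipe from the kubernetes-dashboard documentation, roughly like this (note that binding to cluster-admin is very permissive, acceptable for a sandbox only) :

  apiVersion: v1
  kind: ServiceAccount
  metadata:
    name: admin-user
    namespace: kube-system
  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    name: admin-user
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: cluster-admin
  subjects:
  - kind: ServiceAccount
    name: admin-user
    namespace: kube-system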

dashboard nodes section

Metrics :

ref: https://github.com/kubernetes/dashboard/wiki/Integrations

cert-manager

@TODO use this with letsencrypt - similar to traefik's letsencrypt support in terms of features (should also be possible to "mix" both letsencrypt and self-signed certs, like traefik does)

--> no official ARM support yet :

allow pods on master nodes

optional : if you want to allow the kubernetes master node to run regular workloads

  # remove the 'node-role.kubernetes.io/master' taint from all nodes, so master nodes also accept pods:
  kubectl taint nodes --all node-role.kubernetes.io/master-

kubernetes commands

  kubectl get pods --all-namespaces

  kubectl get pods --namespace=kube-system

  kubectl get nodes

  kubectl logs --tail=50 kube-flannel-ds-arm-rdk7f -n kube-system

  # list services in kube-system namespace :
  kubectl get svc -n kube-system

  kubectl apply -f <file.yml | url_to_yml>
  kubectl delete -f <file.yml | url_to_yml>

  # basic health report :
  kubectl get cs

  # basic :
  kubectl run appname --image=myimage:tag

  # useful when pods are stuck in 'ContainerCreating' status (while no logs are available yet) :
  kubectl describe pods -n ingress-nginx

teardown

kubectl drain pi03.p13.p.s18m2.com --delete-local-data --force --ignore-daemonsets ; kubectl delete node pi03.p13.p.s18m2.com
kubectl drain pi04.p13.p.s18m2.com --delete-local-data --force --ignore-daemonsets ; kubectl delete node pi04.p13.p.s18m2.com
kubectl drain pi01.p13.p.s18m2.com --delete-local-data --force --ignore-daemonsets ; kubectl delete node pi01.p13.p.s18m2.com
kubectl drain pi02.p13.p.s18m2.com --delete-local-data --force --ignore-daemonsets ; kubectl delete node pi02.p13.p.s18m2.com

# on each node
kubeadm reset
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
ipvsadm -C

https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#tear-down

Tear down

To undo what kubeadm did, you should first drain the node and make sure that the node is empty before shutting it down.

Talking to the master with the appropriate credentials, run:

kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
kubectl delete node <node name>

Then, on the node being removed, reset all kubeadm installed state:

kubeadm reset

The reset process does not reset or clean up iptables rules or IPVS tables. If you wish to reset iptables, you must do so manually:

iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

If you want to reset the IPVS tables, you must run the following command:

ipvsadm -C

If you wish to start over simply run kubeadm init or kubeadm join with the appropriate arguments.

More options and information about the kubeadm reset command

kubectl on a machine external to the cluster

  curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
  # yes, this line shows "xenial" although we're running debian stretch on our desktop machine :
  echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
  apt-get update && apt-get -y install kubectl
  me@desktop$ mkdir ~/.kube
  me@desktop$ scp pi04:/home/lonelyone/.kube/config $HOME/.kube/

  me@desktop$ kubectl get pods --all-namespaces
  NAMESPACE        NAME                                           READY   STATUS    RESTARTS   AGE
  default          my-nginx-7fd64b656c-j78sc                      1/1     Running   0          42m
  default          my-nginx-7fd64b656c-pjjmb                      1/1     Running   0          42m
  ingress-nginx    nginx-ingress-controller-594f658645-nhxg2      1/1     Running   0          53m
  kube-system      coredns-86c58d9df4-kjgl7                       1/1     Running   0          6h30m
  kube-system      coredns-86c58d9df4-znd5j                       1/1     Running   0          6h30m
  kube-system      etcd-pi04.p13.p.s18m2.com                      1/1     Running   0          6h30m
  kube-system      kube-apiserver-pi04.p13.p.s18m2.com            1/1     Running   1          6h31m
  kube-system      kube-controller-manager-pi04.p13.p.s18m2.com   1/1     Running   0          6h30m
  kube-system      kube-flannel-ds-arm-46bxm                      1/1     Running   1          6h26m
  kube-system      kube-flannel-ds-arm-lg69b                      1/1     Running   1          6h26m
  kube-system      kube-flannel-ds-arm-lrj2n                      1/1     Running   0          6h28m
  kube-system      kube-flannel-ds-arm-wtcxj                      1/1     Running   1          6h27m
  kube-system      kube-proxy-c5cjn                               1/1     Running   0          6h26m
  kube-system      kube-proxy-cts58                               1/1     Running   0          6h27m
  kube-system      kube-proxy-k758k                               1/1     Running   0          6h26m
  kube-system      kube-proxy-lncbd                               1/1     Running   0          6h30m
  kube-system      kube-scheduler-pi04.p13.p.s18m2.com            1/1     Running   0          6h31m
  metallb-system   controller-7cc9c87cfb-b6m55                    1/1     Running   0          74m
  metallb-system   speaker-2gmfk                                  1/1     Running   0          74m
  metallb-system   speaker-h67f8                                  1/1     Running   0          74m
  metallb-system   speaker-wh5ww                                  1/1     Running   0          74m

multi-master setup

refs:

environments

namespaces and dns

When you create a Service, it creates a corresponding DNS entry. This entry is of the form <service-name>.<namespace-name>.svc.cluster.local ("cluster.local" being the default cluster domain set up by kubeadm init), which means that if a container just uses <service-name>, it will resolve to the service which is local to its namespace.
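
A quick way to check this resolution from inside the cluster (a hedged example : 'my-nginx' is the test service from the default namespace created earlier ; busybox is pinned to 1.28 because nslookup is known to misbehave in newer busybox images) :

  kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- nslookup my-nginx.default.svc.cluster.local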

Containers grouped together in the same pod may use localhost if tight coupling is needed.

monitoring

  • prometheus

various tools/links

Kubernetes Best practices

This sample sandbox violates quite a few production rules ; here is a talk presenting a few kubernetes best practices (it starts slowly, but finally gets into interesting/useful points) :

Step XX - deploying gogs on kubernetes

see gogs/README.md

WARNING : this is a first attempt ; it does not (yet) use secrets and uses emptyDir storage (meaning you'll lose everything on teardown), but at least it works with separate containers/pods and is quite lightweight compared to gitlab-ce.

I'll configure persistent volumes using my local glusterfs installation (also running on Raspberry Pi machines).
