Only adding the private repository to the Pod resource causes Pods created from a Deployment template to fail
lyyao09 opened this issue · comments
First, deploying kubenab in the Kubernetes cluster solves the problem that the image in the YAML file does not reference a private repository, e.g.:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-pod
spec:
  replicas: 1
  selector:
    matchLabels:
      app: deployment
  template:
    metadata:
      labels:
        app: deployment
    spec:
      containers:
      - image: busybox   # kubenab will add the private repository automatically on create
        name: busybox
However, I found that there is a problem when a node is powered off.
Suppose test-pod is first scheduled on node1 (kubenab is also scheduled on node1). If node1 is powered off, test-pod and kubenab will migrate to other nodes after a while.
If test-pod migrates first while kubenab is not yet ready, the kubelet pulls the image busybox:latest from the default docker.io registry, and the pod enters ErrImagePull (I have to run kubectl delete -f xxx.yaml and then kubectl create -f xxx.yaml after kubenab is ready to work around this).
kubectl describe pod deployment-test-pod-567d7cf9bf-xdfvg
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 20s default-scheduler Successfully assigned default/deployment-test-pod-567d7cf9bf-xdfvg to node2
Normal Pulling 13s kubelet, node1 Pulling image "busybox:latest"
Warning Failed 13s kubelet, node1 Failed to pull image "busybox:latest": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on [::1]:53: read udp [::1]:35116->[::1]:53: read: connection refused
Warning Failed 13s kubelet, node1 Error: ErrImagePull
From kubenab/cmd/kubenab/admissions.go, kubenab only adds the repository to Pod resources. When a Pod is created from a Deployment's or StatefulSet's template, the above problem occurs because no private repository is added to the Deployment or StatefulSet resource itself.
if !contains(whitelistedNamespaces, namespace) {
pod := v1.Pod{}
Please confirm whether this is a bug or my usage is wrong (I used the official example).
This issue could be solved by setting the Pod Priority to system-cluster-critical, since kubenab is relevant to the cluster working correctly.
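As a sketch of that suggestion, the priority would be set in the kubenab Deployment's Pod template. Only `priorityClassName` is the point here; all other field values below are illustrative, not the actual chart values, and note that the built-in critical priority classes may be restricted to the kube-system namespace depending on the cluster version:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kubenab          # illustrative name
  namespace: kube-system # built-in critical classes may be restricted to this namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kubenab
  template:
    metadata:
      labels:
        app: kubenab
    spec:
      # Highest built-in priority: the scheduler places (and, if needed,
      # preempts for) this Pod before ordinary workloads.
      priorityClassName: system-cluster-critical
      containers:
      - name: kubenab
        image: kubenab/kubenab   # illustrative image reference
```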
I created a Pull Request (#18) which targets your problem and solves it.
I wasn't able to reproduce your problem with that patch applied.
Please note that this patch (#18) does not implement fail-over/high availability. It only sets the scheduling priority to the highest possible value, so the kubenab Pod is scheduled before all other Pods!
Fail-over is targeted by PR #12 (which is currently work in progress).
But you could also simply increase the replica count of the kubenab deployment and then add a nodeSelector to prevent scheduling all the Pods on the same node.
@l0nax, thank you for your prompt reply.
I tried the method you provided and found that the problem still exists when a node is abnormally powered off.
I think the root cause is that only the image name in the Pod object (Kind: Pod) is modified and persisted (not in the Deployment or StatefulSet).
K8s version:
[root@node2 01-volume-emptyDir]# kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.2", GitCommit:"f6278300bebbb750328ac16ee6dd3aa7d3549568", GitTreeState:"clean", BuildDate:"2019-08-05T09:23:26Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.2", GitCommit:"f6278300bebbb750328ac16ee6dd3aa7d3549568", GitTreeState:"clean", BuildDate:"2019-08-05T09:15:22Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
You are right.
The MutatingAdmissionController only gets called on CREATE operations (see https://github.com/jfrog/kubenab/blob/master/chart/kubenab/templates/mutating-webhook.yaml#L16).
I will push a change so that UPDATE operations are also covered by the Admission Controller.
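For reference, that change would amount to extending the `operations` list in the chart's mutating-webhook template, roughly as below. This is a sketch based on the linked file; the webhook name and surrounding fields are abbreviated/illustrative:

```yaml
webhooks:
- name: kubenab.example.com          # illustrative; the chart defines the real name
  rules:
  - operations: ["CREATE", "UPDATE"] # previously only ["CREATE"]
    apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
```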
Thanks for your future tests, @lyyao09.
EDIT:
Adding the UPDATE operation to the Admission Controller could also prevent issues if a user (e.g.) changes only the image version of a Pod.
@lyyao09 can you re-open the issue please?
And your issue would be fixed via fail-over (which is targeted by PR #12); this would prevent issues when a node is abnormally powered off.
Editing Deployments/StatefulSets could be done, but they only create Pods, and you can also create Pods without a Deployment or StatefulSet.
So it would be sub-optimal to change them.
A Pod is the "lowest" entity that gets created.
Yes, we should add the UPDATE operation as you say. However, this can't solve my problem.
As said in #17 (comment), fail-over/replication is the only way to fix this problem directly.
Priority classes only set the scheduling priority above all other Pods, but Kubernetes has a "delay" after which it marks a Node as Unschedulable and NotReady.
@l0nax, I simply increased the replica count of the kubenab deployment and then added a podAntiAffinity to prevent scheduling all the Pods on the same node.
After repeated abnormal power-off node tests, it works well.
How does your podAntiAffinity look?
If you hard-code the node names, you prevent Kubernetes from scheduling the Pods dynamically.
As far as I know, Kubernetes doesn't have a feature where you can say "deploy at most 2 pods per node".
You could implement this feature by building an Operator which schedules the Pods dynamically.
I schedule kubenab on the 3 master nodes, and each node has only one kubenab Pod:
...
nodeSelector:
  role: master
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - kubenab
      topologyKey: kubernetes.io/hostname
...
@lyyao09 Sorry for the late answer. I didn't know how to use podAntiAffinity to schedule only one Pod per node.
I will make a change to implement that feature, after understanding how it really works.
Thanks!
@l0nax regarding podAntiAffinity, there is nothing to implement; it is already there:
https://github.com/jfrog/kubenab/blob/master/chart/kubenab/values.yaml#L132
https://github.com/jfrog/kubenab/blob/master/chart/kubenab/templates/deployment.yaml#L57
:)
One example of adding it to an override-values.yaml file:
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: helm.sh/chart
          operator: In
          values:
          - kubenab
      topologyKey: "kubernetes.io/hostname"
Any of these labels can be used for podAntiAffinity.
@rimusz oh sorry, I hadn't seen that.