deislabs / containerd-wasm-shims

containerd shims for running WebAssembly workloads in Kubernetes

Using `limits` with the shim makes the pod fail.

mikkelhegn opened this issue

The livenessProbe reports failure continuously. I'm not sure whether the pod is being restarted because of that, whether it is actually running, or what the problem is.
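
A quick way to check whether the kubelet is actually restarting the pod because of the failing probe (standard kubectl; the app=fails label matches the failing deployment below):

kubectl get pods -l app=fails        # the RESTARTS column shows whether the pod is being restarted
kubectl describe pods -l app=fails   # Events list "Liveness probe failed" and any resulting container restarts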

Repro using k3d

k3d cluster create wasm-cluster \
  --image ghcr.io/deislabs/containerd-wasm-shims/examples/k3d:v0.10.0 \
  -p "8081:80@loadbalancer" \
  --agents 0

kubectl apply -f https://raw.githubusercontent.com/deislabs/containerd-wasm-shims/main/deployments/workloads/runtime.yaml
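
Before applying the workloads, it can be worth confirming that the shim's runtime classes were registered by the runtime.yaml applied above:

kubectl get runtimeclass   # wasmtime-spin should be listed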

Then apply the following workloads for comparison:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: fails
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fails
  template:
    metadata:
      labels:
        app: fails
    spec:
      runtimeClassName: wasmtime-spin
      containers:
        - name: fails
          image: ghcr.io/deislabs/containerd-wasm-shims/examples/spin-rust-hello:latest
          command: ["/"]
          resources:
            limits:
              cpu: 100m
              memory: 128Mi
            requests:
              cpu: 100m
              memory: 128Mi
          livenessProbe:
            httpGet:
              path: /.well-known/spin/health
              port: 80
            initialDelaySeconds: 3
            periodSeconds: 3
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: works
spec:
  replicas: 1
  selector:
    matchLabels:
      app: works
  template:
    metadata:
      labels:
        app: works
    spec:
      runtimeClassName: wasmtime-spin
      containers:
        - name: works
          image: ghcr.io/deislabs/containerd-wasm-shims/examples/spin-rust-hello:latest
          command: ["/"]
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
          livenessProbe:
            httpGet:
              path: /.well-known/spin/health
              port: 80
            initialDelaySeconds: 3
            periodSeconds: 3
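
One way to compare the two is to watch restart counts side by side (label names from the manifests above):

kubectl get pods -l 'app in (fails,works)' -w   # the fails pod should show its RESTARTS count climbing while works stays at 0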

Just wanted to add my findings here as well. It looks like there is a CPU spike during startup that gets throttled by the resource limits. This might not be specific to the shim but a general issue with resource limits in Kubernetes. For example, I used the following two deployments to check how long it takes for Spin's port to open: with higher (or no) limits on the pod, the port opens relatively quickly. (A sketch for checking the throttling counters on the node follows after the two manifests.)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: spin-slow-start
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spin-slow-start
  template:
    metadata:
      labels:
        app: spin-slow-start
    spec:
      runtimeClassName: wasmtime-spin
      containers:
        - name: spin-hello
          image: ghcr.io/deislabs/containerd-wasm-shims/examples/spin-rust-hello:v0.10.0
          command: ["/"]
          resources:
            limits:
              cpu: 100m
              memory: 128Mi
            requests:
              cpu: 100m
              memory: 128Mi
        - image: alpine:latest
          name: debug-alpine
          command: ["/bin/sh", "-c"]
          args:
            - |
              TARGET_HOST='127.0.0.1'

              echo "START: waiting for $TARGET_HOST:80"
              timeout 60 sh -c 'until nc -z $0 $1; do sleep 1; done' $TARGET_HOST 80
              echo "END: waiting for $TARGET_HOST:80"

              sleep 100000000
          resources: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spin-faster-start
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spin-faster-start
  template:
    metadata:
      labels:
        app: spin-faster-start
    spec:
      runtimeClassName: wasmtime-spin
      containers:
        - name: spin-hello
          image: ghcr.io/deislabs/containerd-wasm-shims/examples/spin-rust-hello:v0.10.0
          command: ["/"]
          resources:
            limits:
              cpu: 400m
              memory: 512Mi
            requests:
              cpu: 100m
              memory: 128Mi
        - image: alpine:latest
          name: debug-alpine
          command: ["/bin/sh", "-c"]
          args:
            - |
              TARGET_HOST='127.0.0.1'

              echo "START: waiting for $TARGET_HOST:80"
              timeout 60 sh -c 'until nc -z $0 $1; do sleep 1; done' $TARGET_HOST 80
              echo "END: waiting for $TARGET_HOST:80"

              sleep 100000000
          resources: {}
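
To confirm that it is CPU throttling slowing the startup, the CFS throttling counters for the pod's cgroup can be inspected on the k3d node. This is only a sketch: the node container name follows k3d's default naming for the cluster created above, and the cgroup layout varies between cgroup v1 and v2:

docker exec k3d-wasm-cluster-server-0 sh -c \
  'find /sys/fs/cgroup -path "*kubepods*" -name cpu.stat | xargs grep -H throttled'
# A growing nr_throttled / throttled_usec (throttled_time on cgroup v1) for the spin
# container's cgroup indicates the startup spike is hitting the CPU limit.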

Maybe the fix here is simply to remove the limits from the example deployments, or to bump them up? We could also evaluate adding a pod overhead (overhead.podFixed) configuration to the runtime class so the limits are more tolerant of startup spikes, though that might affect the ability to schedule the pods on smaller nodes.
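
A rough sketch of what that would look like on the RuntimeClass, with illustrative values; the handler name is an assumption and should match whatever handler the project's runtime.yaml registers for the Spin shim:

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: wasmtime-spin
handler: spin              # assumed containerd handler name for the Spin shim
overhead:
  podFixed:                # counted during scheduling and added to the pod-level cgroup
    cpu: 250m              # illustrative headroom for the startup spike
    memory: 64Mi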

@mikkelhegn could you check whether higher limits help?

You might also play with the livenessProbe settings. Delaying the first probe a few more seconds, past the initial boot spike, might help too.

initialDelaySeconds: 10
periodSeconds: 3

I had to bump initialDelaySeconds to 45 seconds to keep the livenessProbe from failing.
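
For reference, that probe stanza, assuming the same health path and port as the manifests above, would look like:

livenessProbe:
  httpGet:
    path: /.well-known/spin/health
    port: 80
  initialDelaySeconds: 45
  periodSeconds: 3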