kubernetes-sigs / scheduler-plugins

Repository for out-of-tree scheduler plugins based on scheduler framework.

[Capacity Scheduler] Capacity scheduler won't preempt pods if all resource items are used

bfinta opened this issue · comments

Area

  • Scheduler
  • Controller
  • Helm Chart
  • Documents

Other components

No response

What happened?

The capacity scheduler won't preempt pods when all resources are already in use, because of the check at https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/pkg/capacityscheduling/capacity_scheduling.go#L592

Scenario:
Cluster has 5 GPUs. Team A has the following elastic quota: gpu.min: 4, gpu.max: 5. Team B has gpu.min: 1, gpu.max: 5.
Team A runs a workload with 5 pods that consumes all 5 GPUs. When Team B then tries to run a workload requesting even a single GPU, its pod stays Pending because the aggregate check fails:
sum(quotas.used) + pod.requests > sum(quotas.min)
5 + 1 > 5
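
For illustration, here is a minimal, self-contained Go sketch of that aggregate condition as I understand it from the linked line; the type and function names below (ElasticQuota, aggregatedUsedOverMinWith) are my own, not necessarily the plugin's exact API:

```go
package main

import "fmt"

// ElasticQuota models the min/used accounting for a single resource
// (GPUs in this scenario). Field names here are illustrative only.
type ElasticQuota struct {
	Name string
	Min  int64 // guaranteed share (gpu.min)
	Max  int64 // upper bound (gpu.max)
	Used int64 // currently consumed
}

// aggregatedUsedOverMinWith reproduces the condition described above:
// preemption is skipped when cluster-wide usage plus the incoming pod's
// request exceeds the sum of all quota minimums.
func aggregatedUsedOverMinWith(quotas []ElasticQuota, podRequest int64) bool {
	var sumUsed, sumMin int64
	for _, q := range quotas {
		sumUsed += q.Used
		sumMin += q.Min
	}
	return sumUsed+podRequest > sumMin
}

func main() {
	quotas := []ElasticQuota{
		{Name: "team-a", Min: 4, Max: 5, Used: 5}, // Team A borrowed 1 GPU above its min
		{Name: "team-b", Min: 1, Max: 5, Used: 0},
	}
	// Team B requests 1 GPU: 5 + 1 > 4 + 1, so preemption is not attempted
	// and the pod stays Pending.
	fmt.Println(aggregatedUsedOverMinWith(quotas, 1)) // true
}
```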

What did you expect to happen?

The scheduler should preempt pods until the other ElasticQuota's min is reached, i.e. reclaim GPUs from Team A (which is above its min) so Team B can get its guaranteed 1 GPU; see the sketch below.
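
For contrast, a rough sketch of the behavior I would expect (reusing the illustrative ElasticQuota type from the snippet above): a pending pod should be eligible to trigger preemption as long as its own quota is still below its guaranteed min, with victims chosen from quotas that borrowed above their min.

```go
// shouldPreemptForMin is an illustration of the expected behavior, not
// existing plugin code: eligibility is decided per quota rather than from
// the cluster-wide aggregate. Team B (used 0, min 1, request 1) would pass
// this check, and Team A's fifth pod (above Team A's min of 4) would be a
// preemption candidate.
func shouldPreemptForMin(podQuota ElasticQuota, podRequest int64) bool {
	return podQuota.Used+podRequest <= podQuota.Min
}
```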

How can we reproduce it (as minimally and precisely as possible)?

No response

Anything else we need to know?

It would be great if this behavior could be made configurable in the scheduler configuration file; a hypothetical sketch follows.
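
Purely as a hypothetical example of what such a knob could look like if it were exposed through the plugin's arguments in the scheduler configuration; neither the args type nor the field name below is claimed to exist in the repository today:

```go
// CapacitySchedulingArgs is shown here only as a hypothetical shape for
// plugin arguments; it is an assumption, not the repo's current API.
type CapacitySchedulingArgs struct {
	// AllowPreemptionWithinMin, if true, would let a pending pod trigger
	// preemption whenever its own ElasticQuota is still below min, even
	// when cluster-wide usage already exceeds the sum of all minimums.
	AllowPreemptionWithinMin bool `json:"allowPreemptionWithinMin,omitempty"`
}
```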

Kubernetes version

Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.1", GitCommit:"8f94681cd294aa8cfd3407b8191f6c70214973a4", GitTreeState:"clean", BuildDate:"2023-01-18T15:58:16Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.6", GitCommit:"b39bf148cd654599a52e867485c02c4f9d28b312", GitTreeState:"clean", BuildDate:"2022-09-21T13:12:04Z", GoVersion:"go1.18.6", Compiler:"gc", Platform:"linux/amd64"}

Scheduler Plugins version

registry.k8s.io/scheduler-plugins/controller:v0.24.9

I think it's part of the original design to ensure the system's resources are not fully occupied by a single tenant/namespace.

@denkensk I mentioned this symptom to you the other day. Overall, I feel this is a bit counterintuitive, as setting two tenants to 4/5 and 1/5 when 5 GPUs are available in total sounds like a common way to allocate elastic quota. Could you help shed some light on the original design?