[Capacity Scheduler] Capacity scheduler won't preempt pods if all resource items are used
bfinta opened this issue · comments
Area
- Scheduler
- Controller
- Helm Chart
- Documents
Other components
No response
What happened?
The capacity scheduler won't preempt pods when all of a resource is already in use, because of the check at https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/pkg/capacityscheduling/capacity_scheduling.go#L592
Scenario:
Cluster has 5 GPUs. Team A has the following elastic quota: gpu.min: 4, gpu.max: 5. Team B has gpu.min: 1, gpu.max: 5.
Team A runs a workload with 5 pods that consumes all 5 GPUs. When Team B then tries to run a workload, even one that requests only 1 GPU, the pod stays in Pending, because
sum(quotas.used) + pod.requests > sum(quotas.min)
5 + 1 > 5
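To make the arithmetic concrete, here is a minimal Go sketch of the scenario. It is not the plugin's code: the `quota` type and both helper functions are hypothetical names made up for illustration, and the aggregate check only mirrors the inequality quoted above. The second helper shows the per-quota comparison I would have expected to matter for Team B.

```go
package main

import "fmt"

// quota is a hypothetical, simplified stand-in for the GPU fields of an
// ElasticQuota. It is not the type used by scheduler-plugins.
type quota struct {
	name           string
	min, max, used int64
}

// aggregateOverMin mirrors the inequality from the report:
// sum(quotas.used) + pod.requests > sum(quotas.min).
func aggregateOverMin(quotas []quota, request int64) bool {
	var totalUsed, totalMin int64
	for _, q := range quotas {
		totalUsed += q.used
		totalMin += q.min
	}
	return totalUsed+request > totalMin
}

// withinOwnMin reports whether the requesting quota would still be at or
// below its own min after the pod is admitted (assumed expected behavior).
func withinOwnMin(q quota, request int64) bool {
	return q.used+request <= q.min
}

func main() {
	teamA := quota{name: "team-a", min: 4, max: 5, used: 5}
	teamB := quota{name: "team-b", min: 1, max: 5, used: 0}

	// Aggregate check: 5 + 1 > 5, so Team B's pod is rejected.
	fmt.Println(aggregateOverMin([]quota{teamA, teamB}, 1)) // true

	// Per-quota view: 0 + 1 <= 1, so Team B is only asking for its min.
	fmt.Println(withinOwnMin(teamB, 1)) // true
}
```

With these numbers Team A is using 1 GPU more than its min, which is exactly the amount Team B needs to reach its own min.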
What did you expect to happen?
The scheduler should preempt pods until the other EQ's min is reached.
How can we reproduce it (as minimally and precisely as possible)?
No response
Anything else we need to know?
It would be great if this behavior could be configured in the scheduler configuration file.
Kubernetes version
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.1", GitCommit:"8f94681cd294aa8cfd3407b8191f6c70214973a4", GitTreeState:"clean", BuildDate:"2023-01-18T15:58:16Z", GoVersion:"go1.19.5", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.6", GitCommit:"b39bf148cd654599a52e867485c02c4f9d28b312", GitTreeState:"clean", BuildDate:"2022-09-21T13:12:04Z", GoVersion:"go1.18.6", Compiler:"gc", Platform:"linux/amd64"}
Scheduler Plugins version
I think it's part of the original design, to ensure that the system's resources are not fully occupied by a single tenant/namespace.
@denkensk I mentioned this symptom to you the other day. Overall, I find this a bit counterintuitive, since setting two tenants to 4/5 and 1/5 with 5 in total sounds like a common way to allocate elastic quota. Could you help shed some light on the original design?