kubernetes-retired / kube-batch

A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC

Wrong condition for reclaim action

zionwu opened this issue

Is this a BUG REPORT or FEATURE REQUEST?:
/kind bug

What happened:

In the reclaim action (https://github.com/kubernetes-sigs/kube-batch/blob/master/pkg/scheduler/actions/reclaim/reclaim.go#L151), the following condition is wrong:

	if allRes.Less(resreq) {
		glog.V(3).Infof("Not enough resource from victims on Node <%s>.", n.Name)
		continue
	}

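This means the node is skipped only if all of mem/cpu/gpu in allRes are less than in resreq. If just one of them is insufficient, the check passes and the action still tries to reclaim on a node whose victims cannot satisfy the request. The node should instead be skipped if any of mem/cpu/gpu in allRes is less than in resreq.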

It should be:

	if !resreq.LessEqual(allRes) {
		glog.V(3).Infof("Not enough resource from victims on Node <%s>.", n.Name)
		continue
	}
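
To illustrate the difference between the two conditions, here is a minimal, self-contained sketch. The Resource type and its Less/LessEqual methods below are simplified stand-ins written for this example (assumed to compare every dimension, as described above); they are not the actual kube-batch implementation.

```go
package main

import "fmt"

// Resource is a hypothetical, simplified stand-in for the scheduler's
// resource type; it only exists to illustrate the comparison semantics.
type Resource struct {
	MilliCPU float64
	Memory   float64
	GPU      float64
}

// Less reports whether EVERY dimension of r is strictly less than rr.
func (r Resource) Less(rr Resource) bool {
	return r.MilliCPU < rr.MilliCPU && r.Memory < rr.Memory && r.GPU < rr.GPU
}

// LessEqual reports whether EVERY dimension of r is less than or equal to rr.
func (r Resource) LessEqual(rr Resource) bool {
	return r.MilliCPU <= rr.MilliCPU && r.Memory <= rr.Memory && r.GPU <= rr.GPU
}

// skipOld mirrors the current condition: skip the node only if all
// dimensions of the victims' resources are below the request.
func skipOld(allRes, resreq Resource) bool {
	return allRes.Less(resreq)
}

// skipNew mirrors the proposed condition: skip the node if the request
// does not fit, i.e. if any dimension of the victims' resources is short.
func skipNew(allRes, resreq Resource) bool {
	return !resreq.LessEqual(allRes)
}

func main() {
	// Victims free up plenty of CPU but not enough memory.
	allRes := Resource{MilliCPU: 4000, Memory: 1024, GPU: 0}
	resreq := Resource{MilliCPU: 2000, Memory: 2048, GPU: 0}

	fmt.Println("old condition skips node:", skipOld(allRes, resreq))
	fmt.Println("new condition skips node:", skipNew(allRes, resreq))
}
```

In this example the victims release enough CPU but not enough memory. The old condition does not skip the node, while the proposed one does.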

The preempt action has the same problem. I am going to write a fix.