iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.

Home Page: http://iree.dev/

Runtime CPU load balance strategy for local-task

bhbruce opened this issue

Background

I compiled my models for deployment on platforms with 2 CPUs and ran the vmfb with --task_topology_group_count=2.

Observation

The two workers split the dispatch tasks equally.
If one worker completes its tasks first, it waits for the other worker to complete its own.
In other words, one worker sits idle until the other finishes.

For instance, I have a matmul op that is divided into 48 dispatch tasks. Each worker is responsible for 24 of them, and the workers do not help each other.
[image]

Based on the image, worker-1 is idle while worker-2 still has 5 dispatch tasks remaining.
This causes CPU usage to drop to 55%.
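
To make the idle time concrete, here is a minimal Python sketch of a fixed 24/24 split. The per-task costs (19 and 24 time units) are hypothetical values chosen only so that 5 tasks are still pending when the first worker finishes; they are not measurements. The helper `static_split` is made up for illustration and is not part of IREE.

```python
def static_split(durations, num_workers=2):
    """Busy time per worker when tasks are pre-assigned in equal contiguous chunks."""
    if len(durations) % num_workers != 0:
        raise ValueError("task count must divide evenly among workers")
    chunk = len(durations) // num_workers
    return [sum(durations[w * chunk:(w + 1) * chunk]) for w in range(num_workers)]

# Hypothetical per-task costs in arbitrary time units (assumption, not measured):
# worker-1's 24 tasks cost 19 units each, worker-2's 24 tasks cost 24 units each.
SLOW_TASK_COST = 24
durations = [19] * 24 + [SLOW_TASK_COST] * 24

busy = static_split(durations)            # time each worker spends on its own tasks
makespan = max(busy)                      # the dispatch ends when the slower worker ends
idle = [makespan - b for b in busy]       # time each worker waits with nothing to do
remaining = idle[0] // SLOW_TASK_COST     # worker-2 tasks pending when worker-1 finishes
utilization = sum(busy) / (len(busy) * makespan)

print(busy, idle, remaining, round(utilization, 3))
```

Under these assumed costs, worker-1 is idle for 120 of 576 time units while worker-2 works through its last 5 tasks alone. Only one of the two CPUs is busy during that tail.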

Question

Is this behavior expected?
Why doesn't worker-1 assist worker-2 in completing the remaining 5 tasks?
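
For comparison, the kind of assistance I have in mind can be sketched as a shared queue from which whichever worker is free takes the next task. This is only an illustration of the idea, not a description of how the local-task executor is implemented. The function name `shared_queue_schedule` and the task costs are hypothetical.

```python
import heapq

def shared_queue_schedule(durations, num_workers=2):
    """Greedy scheduling: the worker that becomes free first takes the next task."""
    free_at = [(0, w) for w in range(num_workers)]  # (time the worker is free, worker id)
    heapq.heapify(free_at)
    busy = [0] * num_workers
    for cost in durations:
        t, w = heapq.heappop(free_at)               # earliest-available worker
        busy[w] += cost
        heapq.heappush(free_at, (t + cost, w))
    makespan = max(t for t, _ in free_at)
    return makespan, busy

# Hypothetical per-task costs in arbitrary time units (assumption, not measured).
durations = [19] * 24 + [24] * 24

# Fixed split: each worker owns one half and never takes tasks from the other.
static_makespan = max(sum(durations[:24]), sum(durations[24:]))

# Shared queue: no worker idles while tasks are still pending.
dynamic_makespan, dynamic_busy = shared_queue_schedule(durations)

print(static_makespan, dynamic_makespan, dynamic_busy)
```

With these assumed costs, the fixed split finishes at 576 time units and the shared queue at 516, with both workers busy the whole time.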

FYR. @rednoah91

@benvanik Do you know if this case makes sense?