ray-project / ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Home Page: https://ray.io


[Ray scheduling] The memory already used on the Worker Node needs to be taken into account when scheduling Ray tasks

yx367563 opened this issue

Description

Currently, when Ray schedules a task, it only considers the memory resources the user requested in the task's options.
If a task does not specify a `memory` parameter, multiple memory-hungry tasks can end up scheduled onto a single worker node, which can trigger an OOM. Even with the retry mechanism, there is no guarantee about which worker node the task will land on next time.
The scheduler should therefore also take into account the memory actually in use on each worker node.
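The difference between the two feasibility checks can be sketched in plain Python. This is a simplified illustration, not Ray's actual scheduler code; the `Node` fields and function names are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Node:
    total_memory: int     # bytes of memory on the worker node
    reserved_memory: int  # sum of memory requested by tasks placed here
    used_memory: int      # memory actually in use on the node

def feasible_today(node: Node, requested: int) -> bool:
    # Current behavior (simplified): only memory requested via task
    # options counts, so a task that requests nothing always "fits".
    return node.total_memory - node.reserved_memory >= requested

def feasible_proposed(node: Node, requested: int) -> bool:
    # Proposed: also subtract memory already in use on the node, so
    # unannotated memory-hungry tasks stop piling onto one node.
    free = node.total_memory - max(node.reserved_memory, node.used_memory)
    return free >= requested

GiB = 1024 ** 3
# 8 GiB node with no reservations, but 7 GiB already in use:
node = Node(total_memory=8 * GiB, reserved_memory=0, used_memory=7 * GiB)
print(feasible_today(node, 2 * GiB))     # True: nearly-full node still looks empty
print(feasible_proposed(node, 2 * GiB))  # False: only ~1 GiB actually free
```

The same 2 GiB request is accepted by the current check and rejected by the proposed one, which is exactly the gap this issue describes.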

Use case

Users may not set the `memory` parameter when submitting a Ray task, or may not know in advance how much memory the task will consume at runtime. If scheduling relies only on the user-requested memory, multiple tasks with high memory consumption are likely to be placed on a single worker node. And after an OOM retry is triggered, the task may well be rescheduled onto the original worker node, because its requested memory has not changed.

Possible solution: consider the memory already in use on each worker node when scheduling a task, and on an OOM retry expand the task's memory request based on the amount of memory it was observed using before it was killed.
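The retry-expansion part could look roughly like the sketch below. The function name, the choice of `max(requested, peak_used)` as the baseline, and the 1.5x growth factor are all assumptions for illustration, not Ray's actual policy:

```python
def expanded_memory_request(requested: int, peak_used: int,
                            growth_factor: float = 1.5) -> int:
    """On an OOM retry, grow the task's memory request based on the
    memory it was observed using before it was killed.

    Hypothetical policy sketch; the baseline choice and growth factor
    are assumptions, not Ray's implementation.
    """
    # Start from whichever is larger: what the user requested, or what
    # the task actually consumed before the OOM kill.
    baseline = max(requested, peak_used)
    return int(baseline * growth_factor)

GiB = 1024 ** 3
# Task requested 1 GiB but was using ~3 GiB when it was OOM-killed;
# the retry requests 4.5 GiB, steering it to a node with real headroom.
print(expanded_memory_request(1 * GiB, 3 * GiB) / GiB)  # 4.5
```

Combined with the used-memory-aware feasibility check above the original node would no longer look free, so the enlarged retry naturally lands on a node that can actually hold the task.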