apache / datafusion-ballista

Apache DataFusion Ballista Distributed Query Engine

Home Page: https://datafusion.apache.org/ballista

[feature] Improve the load balancing strategy across machines

smallzhongfeng opened this issue · comments

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Since the current slot allocation strategy only compares the remaining slots of each executor when sorting, we could add a load-balancing comparison method. The hardware of each executor is different, so machines with better performance should be able to run more tasks.

Describe the solution you'd like
I think a better way is to collect CPU, memory, disk, and other statistics, compute a weighted average to produce a score, and then use that score to decide the priority of assigning tasks to a given executor.
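
To make the idea concrete, here is a minimal sketch with hypothetical types and purely illustrative weights (not Ballista's actual scheduler API) of scoring executors and ranking them by that score:

```rust
/// Hypothetical snapshot of an executor's resources, e.g. reported via heartbeat.
struct ExecutorStats {
    cpu_idle_ratio: f64,    // 0.0..=1.0
    free_memory_ratio: f64, // 0.0..=1.0
    free_disk_ratio: f64,   // 0.0..=1.0
}

/// Weighted average of resource metrics; the weights are illustrative only.
fn load_score(s: &ExecutorStats) -> f64 {
    0.5 * s.cpu_idle_ratio + 0.3 * s.free_memory_ratio + 0.2 * s.free_disk_ratio
}

/// Order executor ids so that higher-scoring (less loaded) executors come first.
fn rank_executors(mut executors: Vec<(String, ExecutorStats)>) -> Vec<String> {
    executors.sort_by(|a, b| {
        load_score(&b.1)
            .partial_cmp(&load_score(&a.1))
            .unwrap_or(std::cmp::Ordering::Equal)
    });
    executors.into_iter().map(|(id, _)| id).collect()
}
```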

How about using different task slots for different executors?

Sorry... I don't quite get your point. Could you be more specific about this strategy?

Sorry... I don't quite get your point. Could you be more specific about this strategy?

I believe he means that the number of task slots is configured on a per-executor basis. When the executor registers with the scheduler, it tells the scheduler how many task slots it has available, and the scheduler tracks task slots at the executor level, so there is no need for all executors to have the same number of slots. If you have executors which can handle more concurrency, you can simply configure those executors to register with more task slots.
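
As a rough sketch of that flow (hypothetical types, not Ballista's actual registration protocol), the scheduler only needs per-executor slot counts:

```rust
use std::collections::HashMap;

/// Hypothetical registration message: each executor declares its own slot count.
struct ExecutorRegistration {
    executor_id: String,
    task_slots: u32, // configured per executor; bigger machines register more slots
}

/// The scheduler tracks available slots per executor, so heterogeneous
/// slot counts need no special handling.
struct SlotTracker {
    available: HashMap<String, u32>,
}

impl SlotTracker {
    fn register(&mut self, reg: ExecutorRegistration) {
        self.available.insert(reg.executor_id, reg.task_slots);
    }

    /// Pick the executor that currently has the most free slots.
    fn next_executor(&self) -> Option<&String> {
        self.available
            .iter()
            .filter(|(_, slots)| **slots > 0)
            .max_by_key(|(_, slots)| **slots)
            .map(|(id, _)| id)
    }
}
```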

Thanks for your reply! I get it, but let me give an extreme example: if executor1 on host1 has one core and is configured with 10 slots, and executor2 on host2 has 10 cores and is configured with 2 slots, will executor1 be given priority because it has 10 free slots? In terms of machine performance, executor2 should be assigned tasks with higher priority, right? @thinkharderdev @yahoNanJing

@smallzhongfeng

So the general idea is to allocate slots according to the number of cores:

executor1 has 1 core -> configure this executor with 1 slot
executor2 has 10 cores -> configure this executor with 10 slots

The executors themselves report back on completed tasks, so if a slot is free it will be filled whenever a new task is available.
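
A minimal sketch of that slot lifecycle, with hypothetical names rather than the actual scheduler code: a slot is reserved when a task launches and released when the executor reports completion, so it can be filled by the next available task.

```rust
/// Hypothetical per-executor slot counter; the real scheduler state is more involved.
struct ExecutorSlots {
    total: u32,
    in_use: u32,
}

impl ExecutorSlots {
    /// Reserve a slot when a task is launched on this executor.
    fn try_reserve(&mut self) -> bool {
        if self.in_use < self.total {
            self.in_use += 1;
            true
        } else {
            false
        }
    }

    /// Called when the executor reports a completed task, freeing the slot
    /// so it can be filled as soon as a new task is available.
    fn release(&mut self) {
        self.in_use = self.in_use.saturating_sub(1);
    }
}
```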

@smallzhongfeng

So the general idea is to allocate slots according to the number of cores:

executor1 has 1 core -> configure this executor with 1 slot
executor2 has 10 cores -> configure this executor with 10 slots

Yes, having read the code, I would configure it like this. But for users who don't want to care about the configuration, I think we could add an adaptive capability: for example, automatically obtain the number of cores of the current machine, compute a ratio between the number of cores and the configured slots, and give the executors with a higher ratio priority when assigning tasks (a rough sketch follows below).

If we have many machines with different hardware models, it is not practical for us to modify the configuration manually.
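
As a rough sketch of the adaptive idea above (hypothetical types; following the earlier executor1/executor2 example, the comparison prefers more cores per configured slot):

```rust
/// Hypothetical descriptor combining the configured slot count with the
/// automatically detected core count of the machine.
struct ExecutorProfile {
    executor_id: String,
    configured_slots: u32,
    cpu_cores: u32,
}

impl ExecutorProfile {
    /// Cores available per configured slot; a higher value suggests each task
    /// gets more CPU on this executor.
    fn cores_per_slot(&self) -> f64 {
        self.cpu_cores as f64 / self.configured_slots.max(1) as f64
    }
}

/// Return executor ids ordered so that executors with more cores per slot come first.
fn prefer_order(mut profiles: Vec<ExecutorProfile>) -> Vec<String> {
    profiles.sort_by(|a, b| {
        b.cores_per_slot()
            .partial_cmp(&a.cores_per_slot())
            .unwrap_or(std::cmp::Ordering::Equal)
    });
    profiles.into_iter().map(|p| p.executor_id).collect()
}
```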

If we have many machines with different hardware models, it is not practical for us to modify the configuration manually.

The executor will by default use the number of available cores for its concurrent task slot configuration, so you shouldn't need any special configuration if you're okay with that default.
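
For reference, a minimal sketch of deriving such a default from the machine the executor runs on, using only the Rust standard library (the actual Ballista configuration wiring may differ):

```rust
use std::thread;

/// Fall back to the number of cores visible to the process when the user has
/// not configured the concurrent task slots explicitly.
fn default_task_slots(configured: Option<usize>) -> usize {
    configured.unwrap_or_else(|| {
        thread::available_parallelism()
            .map(|n| n.get())
            .unwrap_or(1)
    })
}

fn main() {
    println!("task slots: {}", default_task_slots(None));
}
```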

#832 (comment)

What do you think of this proposal? @thinkharderdev

#832 (comment)

What do you think of this proposal? @thinkharderdev

Seems reasonable

Could I raise a PR for it? :-)

If we have many machines with different hardware models, it is not practical for us to modify the configuration manually.

This part can be done in the executor startup script. In the script we can get the node's CPU core count if we want and then set the config. In my view, the CPU core count is a redundant resource metric. What's more, in some cases multiple executors may reside on one machine, so it's not good to depend on the node's CPU core count for task assignment.

This part can be done in the executor startup script

Good idea!

In my view, the CPU core count is a redundant resource metric. What's more, in some cases multiple executors may reside on one machine, so it's not good to depend on the node's CPU core count for task assignment.

With multiple executors on one machine, this problem is indeed not solved. But for one executor per machine, I think taking the CPU core count into account would be more accurate.

taking the CPU core count into account would be more accurate

I don't think so. The executor slot is a virtual concept describing how many tasks an executor can handle concurrently. It is related to the CPU core count, but they are different things. For some machines it may be better to assign one task per core, while for others two or three, depending on whether the tasks are CPU-bound or I/O-bound. Therefore we should use a more general notion to indicate task concurrency.

Actually, in our production environment we use memory size rather than CPU core count to determine task concurrency. The idea is similar to the one used by Hadoop systems.
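
A minimal sketch of both sizing approaches, with purely illustrative numbers (hypothetical helpers, not the production configuration): a per-core factor for CPU-bound versus I/O-bound workloads, and a memory budget per task in the spirit of Hadoop-style resource managers.

```rust
/// Illustrative only: derive task concurrency from a per-core factor.
fn slots_from_cores(cores: usize, tasks_per_core: usize) -> usize {
    // CPU-bound workloads might use tasks_per_core = 1; I/O-bound ones 2 or 3.
    cores * tasks_per_core
}

/// Illustrative only: derive task concurrency from a memory budget per task,
/// similar in spirit to Hadoop-style resource managers.
fn slots_from_memory(total_memory_mb: usize, memory_per_task_mb: usize) -> usize {
    (total_memory_mb / memory_per_task_mb).max(1)
}

fn main() {
    println!("I/O-bound, 8 cores:  {}", slots_from_cores(8, 2));
    println!("32 GiB, 2 GiB/task:  {}", slots_from_memory(32 * 1024, 2 * 1024));
}
```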

I agree with @yahoNanJing here. Having a default where task slots = CPU cores is reasonable, but different workloads have different constraints. Sometimes the bottleneck is CPU, sometimes memory (e.g. if you do a lot of joins and high-cardinality aggregations), and it could even be network bandwidth or disk size. Trying to "derive" the task slots from static config values might just get confusing and complicated, so a more maintainable approach is to have a sensible, easily explained default and then let users configure task concurrency based on their use case and whatever parameters they want to include. As mentioned, this can easily be accomplished with a script which picks the right task concurrency and then runs the executor binary, passing the value to the existing config option.

That's OK, I understand now. Thank you very much for your answers; I will close this issue for now.