alibaba / PhotonLibOS

Probably the fastest coroutine lib in the world!

Home Page: https://PhotonLibOS.github.io

Thread-per-core Architecture

kosav7r opened this issue · comments

Hi folks,

I'm in the process of building a storage system and evaluating PhotonLib. Some of the architectural decisions for the system:

  1. Thread per core, no context switching is desired
  2. Share nothing; each thread will allocate its own resources, including memory and network (sockets). The goal is to eliminate synchronization and improve cache efficiency; this is absolutely essential.
  3. Interrupt Affinity

Do you have any recommendations on thread-to-thread communication without sharing any memory?

The coroutine-based approach is new to me. Can I achieve these basic needs with PhotonLib? If so, are there any examples?

Thanks!

Kindly pinging :)

Yes, Photon coroutines can satisfy your requirements. Every resource is located in a single thread and shared among its coroutines. You can read the documentation for more details.

A Linux thread is referred to as a vCPU in Photon. Each of them has a dedicated scheduler for coroutines (Photon's threads) and a dedicated instance of an event engine (e.g. epoll or io_uring). Their execution is basically independent of one another, unless you conduct inter-vCPU task coordination or migration.
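
A minimal sketch of that model, for illustration: it assumes `photon::init()`/`photon::fini()` can be called once per OS thread to set that thread up as a vCPU with its own event engine (the WorkPool mentioned later in this thread packages this up for you), and that the engine flags and `thread_create11`/`thread_enable_join` helpers match your Photon version. Please verify against the headers and docs before relying on it.

```cpp
// Sketch: one OS thread per core, each set up as a Photon vCPU with its own
// event engine; coroutines created on a vCPU are scheduled only on that vCPU.
// Flags and helper names are assumptions taken from the Photon docs.
#include <photon/photon.h>
#include <photon/thread/thread11.h>
#include <thread>
#include <vector>
#include <cstdio>

static void run_vcpu(int id) {
    // Turn this OS thread into a vCPU with its own event engine (assumption:
    // per-thread init is supported in your version; otherwise use WorkPool).
    if (photon::init(photon::INIT_EVENT_DEFAULT, photon::INIT_IO_NONE) != 0) return;

    // A coroutine (Photon thread) local to this vCPU.
    auto* th = photon::thread_create11([id] {
        printf("coroutine running on vCPU %d\n", id);
    });
    auto* jh = photon::thread_enable_join(th);
    photon::thread_join(jh);

    photon::fini();
}

int main() {
    std::vector<std::thread> vcpus;
    for (int i = 0; i < 4; ++i) vcpus.emplace_back(run_vcpu, i);
    for (auto& t : vcpus) t.join();
}
```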

You can realize interrupt affinity in the same way you would for other applications, e.g. by pinning the interrupt handler and the corresponding Photon vCPU (Linux thread) to the same physical CPU core.
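
For the vCPU side, plain pthread affinity is enough; a small sketch (pure POSIX, no Photon-specific API involved), to be combined with steering the NIC/SSD IRQs to the same core via /proc/irq/<N>/smp_affinity_list:

```cpp
// Pin the calling OS thread (e.g. a Photon vCPU) to one physical core.
#include <pthread.h>
#include <sched.h>

static bool pin_to_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);              // restrict this thread to `core` only
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
}
```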

What would you recommend for separating CPU-bound and IO-bound jobs in this programming model? As far as I know, coroutines are for IO-bound jobs.

You can use the migrate API to move your CPU-bound tasks to a specific vCPU. It's lightweight.
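
A rough sketch of that pattern, for illustration only: `photon::thread_migrate`, `photon::CURRENT` and `photon::vcpu_base` appear in Photon's thread headers, but treat the exact signature as an assumption to check against your version; `compute_vcpu`/`io_vcpu` are hypothetical handles captured elsewhere (e.g. at vCPU startup), and `heavy_compute` is a placeholder.

```cpp
// Illustrative sketch: hop a coroutine onto a dedicated compute vCPU for the
// CPU-bound phase, then hop back. Handles are hypothetical; verify the
// thread_migrate signature against your Photon version.
#include <photon/photon.h>
#include <photon/thread/thread.h>

void handle_request(photon::vcpu_base* compute_vcpu, photon::vcpu_base* io_vcpu) {
    // ... IO-bound phase runs on the network/IO vCPU ...

    photon::thread_migrate(photon::CURRENT, compute_vcpu);  // move to the compute vCPU
    // heavy_compute();                                     // hypothetical CPU-bound work

    photon::thread_migrate(photon::CURRENT, io_vcpu);       // move back to send the reply
    // ... respond ...
}
```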

What would you recommend for inter-processor communication if shared memory is absolutely not an option?

Is this a Photon-related issue?

I already appreciate your answers, so I apologize if this sounds unrelated. I am evaluating and comparing Photon with Seastar, and trying to map approaches in Seastar to Photon.

For example, in Seastar this is mostly done by passing a lambda to a neighboring vCPU. I was wondering what you think is the best approach for communication between vCPUs.

A Photon thread (coroutine) is essentially a function. A lambda is the same.

The underlying implementation of thread migration is an eventfd notification plus a task queue.

Besides that, Photon also has an MPMC queue for transmitting functions, encapsulated as the so-called WorkPool.
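
As an illustration of that (roughly the Photon counterpart of Seastar's "pass a lambda to a neighboring vCPU"), here is a sketch using WorkPool. The constructor arguments and the blocking call() signature are taken from the Photon docs, but may differ across versions, so treat them as assumptions.

```cpp
// Sketch: hand a lambda to a neighboring worker vCPU via WorkPool and wait
// for the result. Flags/signatures are assumptions from the Photon docs.
#include <photon/photon.h>
#include <photon/thread/workerpool.h>
#include <cstdio>

int main() {
    photon::init(photon::INIT_EVENT_DEFAULT, photon::INIT_IO_NONE);

    // 4 worker vCPUs, each with its own coroutine scheduler and event engine.
    photon::WorkPool pool(4, photon::INIT_EVENT_DEFAULT, photon::INIT_IO_NONE);

    int result = 0;
    // call() runs the lambda on one of the worker vCPUs and waits for it.
    pool.call([&result] { result = 42; });
    printf("result = %d\n", result);

    photon::fini();
}
```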

I've searched the code. How about using sched_setaffinity (Linux) / thread_policy_set (macOS) to bind a vCPU to a single CPU core? @beef9999

@loongs-zhang Are you suggesting that we bind vCPU by default?

What would you recommend for inter-processor communication if shared memory is absolutely not an option?

Multi-process without sharing memory? How about a UNIX domain socket?
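
To make the suggestion concrete, here is a minimal share-nothing message-passing sketch between two processes over an AF_UNIX socket pair. It uses plain POSIX calls for clarity; inside each process/vCPU the same file descriptors could be driven non-blockingly by Photon's event engine.

```cpp
// Sketch: two processes exchange a message over an AF_UNIX socket pair,
// with no shared memory between them.
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return 1;

    if (fork() == 0) {                  // child process: uses fds[1] only
        close(fds[0]);
        const char msg[] = "hello from child";
        write(fds[1], msg, sizeof(msg));
        close(fds[1]);
        _exit(0);
    }

    close(fds[1]);                      // parent process: uses fds[0] only
    char buf[64] = {};
    read(fds[0], buf, sizeof(buf) - 1);
    printf("parent received: %s\n", buf);
    close(fds[0]);
    wait(nullptr);
}
```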

@loongs-zhang Are you suggesting that we bind vCPU by default?

yes

What would you recommend for inter-processor communication if shared memory is absolutely not an option?
Multi-process without sharing memory? How about a UNIX domain socket?

How about deep cloning and sharing?

@loongs-zhang Are you suggesting that we bind vCPU by default?

yes

As different apps require different binding configurations, it's difficult for us to do it by default.
For example, a typical scenario is a file/storage server. We may need to consider the IRQ handlers of the NICs and SSDs, as well as our service threads (vCPUs). The best binding configuration should minimize CPU switching along the execution path.

How about deep cloning and sharing?

I'm not sure whether cloning is feasible, as it may imply sharing in the first place, and @kosav7r said shared memory was absolutely not an option.

What would you recommend for separating CPU-bound and IO-bound jobs in this programming model? As far as I know, coroutines are for IO-bound jobs.

Photon has a built-in WorkPool to deal with various kinds of background jobs. For IO-bound ones, you can initialize the worker vCPUs to enable coroutines and event engines. For CPU-bound ones, you can simply use kernel threads without initializing Photon.

BTW, the jobs are efficiently passed to workers through a lock-free shared-memory ring queue.
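
A short sketch of that split, under the same assumptions as the WorkPool sketch earlier in the thread (constructor flags and call() signature may differ by version): IO-bound jobs go to worker vCPUs that have coroutines and event engines enabled, while CPU-bound jobs run on ordinary kernel threads that never initialize Photon.

```cpp
// Sketch: separate IO-bound work (Photon worker vCPUs) from CPU-bound work
// (plain kernel threads, no Photon environment). Flags are assumptions.
#include <photon/photon.h>
#include <photon/thread/workerpool.h>
#include <thread>

int main() {
    photon::init(photon::INIT_EVENT_DEFAULT, photon::INIT_IO_NONE);

    // IO-bound: workers are full vCPUs with coroutine schedulers + event engines.
    photon::WorkPool io_pool(2, photon::INIT_EVENT_DEFAULT, photon::INIT_IO_NONE);
    io_pool.call([] { /* e.g. non-blocking socket or file IO */ });

    // CPU-bound: an ordinary kernel thread, no Photon init needed.
    std::thread cpu_worker([] { /* heavy computation */ });
    cpu_worker.join();

    photon::fini();
}
```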