alibaba / PhotonLibOS

Probably the fastest coroutine lib in the world!

Home Page: https://PhotonLibOS.github.io

If there is only one connection, what's the best practice of taking full advantage of multiple cores?

jiangdongzi opened this issue · comments

If there is only one connection, what's the best practice of taking full advantage of multiple cores?
  1. Use thread_migrate or WorkPool::async_call to dispatch your tasks to different vCPUs, and wait for the tasks to finish (a sketch follows this list).
  2. Either switch back to the previous thread that serves the connection, or write the response back directly from the current thread, with locks.
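
A minimal sketch of option 1, assuming Photon's WorkPool API works as described here: `WorkPool::call` is assumed to block the calling coroutine until the task finishes on a worker vCPU, and `async_call` to take ownership of a heap-allocated callable. Check photon/thread/workerpool.h for the exact signatures.

```cpp
#include <photon/photon.h>
#include <photon/thread/workerpool.h>

int main() {
    // Initialize the current std::thread as vCPU 0.
    photon::init(photon::INIT_EVENT_DEFAULT, photon::INIT_IO_NONE);

    // 4 worker vCPUs, each backed by its own std::thread.
    photon::WorkPool pool(4, photon::INIT_EVENT_DEFAULT, photon::INIT_IO_NONE);

    // Synchronous dispatch: the task runs on a worker vCPU while this
    // coroutine sleeps; execution resumes here once the task is done.
    int result = 0;
    pool.call([&] { result = 42; /* CPU-heavy work goes here */ });

    // Asynchronous dispatch (fire-and-forget): the pool is assumed to
    // delete the heap-allocated callable after running it.
    pool.async_call(new auto([] { /* background work */ }));

    photon::fini();
    return 0;
}
```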

I read the source code and found that Photon uses just one queue shared among the pool's std::threads, so contention may be fierce.
Although Photon uses a lock-free method, the atomic value may still change very frequently, causing frequent CPU cache misses. I have benchmarked multiple threads operating on a single atomic value, and the cost is very high, roughly 200 cycles per operation. I think work stealing is a better approach, like Golang's.
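
For reference, a standalone micro-benchmark (plain C++, not Photon-specific; absolute numbers are machine-dependent) that demonstrates the cost being described: the cache line holding the counter ping-pongs between cores on every fetch_add.

```cpp
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    constexpr int  kThreads = 8;
    constexpr long kIters   = 10'000'000;
    std::atomic<long> counter{0};

    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> workers;
    for (int i = 0; i < kThreads; i++)
        workers.emplace_back([&] {
            // Every increment forces the counter's cache line to move
            // between cores; that transfer is what costs hundreds of cycles.
            for (long j = 0; j < kIters; j++)
                counter.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& t : workers) t.join();

    auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(
                  std::chrono::steady_clock::now() - start).count();
    printf("%.1f ns per fetch_add under contention\n",
           double(ns) / (double(kThreads) * double(kIters)));
}
```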

There have been many discussions on work stealing, but the fact is that it is still under development and at an early stage. It would be appreciated if you could contribute.

I think work stealing is a better approach, like Golang's.

thread_migrate is as efficient as work-stealing. You may migrate a thread to a random vCPU in the pool, as in the sketch below.
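
A hedged sketch of that pattern; `photon::get_vcpu`, `photon::thread_migrate`, and `WorkPool::thread_migrate` are used here as I understand them from thread.h and workerpool.h, so treat the exact signatures as assumptions.

```cpp
#include <photon/thread/thread.h>
#include <photon/thread/workerpool.h>

// Placeholder for the application's CPU-heavy request handling.
void process_request();

void handle_one_request(photon::WorkPool& pool) {
    // Remember the vCPU that owns the connection.
    auto* origin = photon::get_vcpu();

    // Hop this coroutine onto one of the pool's worker vCPUs.
    pool.thread_migrate(photon::CURRENT);

    process_request();  // heavy work now runs off the connection's vCPU

    // Hop back, so the response is written by the vCPU that owns the
    // connection and no locks are needed on the socket.
    photon::thread_migrate(photon::CURRENT, origin);
}
```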

thread_migrate is as efficient as work-stealing.

No, it should be more efficient than work-stealing. In this mode, lock contention occurs only pairwise, between the dispatching vCPU and each worker vCPU. Whereas in work-stealing, lock contention occurs among all of the vCPUs together, because every worker steals from the run queue of the dispatching vCPU.

That is to say, work-stealing should perform similarly to a lock-free ring queue in this scenario: both have a single queue shared and contended by all vCPUs. The sketch below contrasts the two contention topologies.
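
To make that concrete, here is a conceptual sketch with plain mutexes standing in for Photon's lock-free internals (all names are illustrative, not Photon APIs): in the dispatch model each queue is contended by at most two parties, while in the shared-queue model that work-stealing degenerates to here, every worker contends on the same queue head.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

using Task = void (*)();

// Dispatch model (thread_migrate / async_call): the dispatcher pushes
// into a dedicated queue per worker, so each lock is only ever
// contended by two parties: the dispatcher and that one worker.
struct PerWorkerQueues {
    struct Q { std::mutex m; std::deque<Task> q; };
    std::vector<Q> qs;
    explicit PerWorkerQueues(std::size_t n) : qs(n) {}
    void dispatch(std::size_t worker, Task t) {
        std::lock_guard<std::mutex> g(qs[worker].m);
        qs[worker].q.push_back(t);
    }
};

// Shared-queue model (work-stealing with a single producing vCPU):
// every worker pops from the same queue, so its lock (or the atomic
// head of a lock-free ring) is contended by all vCPUs at once.
struct SharedQueue {
    std::mutex m;
    std::deque<Task> q;
    bool steal(Task& out) {
        std::lock_guard<std::mutex> g(m);
        if (q.empty()) return false;
        out = q.front();
        q.pop_front();
        return true;
    }
};
```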