amphp / pipeline

Concurrent iterators and pipeline operations.

Performance Issues

withinboredom opened this issue

I'm attempting to work on an async client for nats-php and noticed that the performance is pretty bad (basis-company/nats.php#62). I've narrowed it down to the implementation of Queue and Pipeline. In fact, the more steps there are in a pipeline, the worse the performance appears to get:

Before adding ->concurrent()

[benchmark screenshot]

After:

[benchmark screenshot]

Without pipeline:

[benchmark screenshot]

(async-background is the pipeline-driven flow)

Should this be expected or is there some way to improve the performance?

Queue implements back-pressure, pausing the current fiber when calling Queue::push() until the current value is consumed. This can introduce some latency and might be leading to the poor performance numbers.

You can avoid this in two ways:

  • Allow Queue to buffer some number of values, so that Queue::push() only waits for a value to be consumed once the buffer is full. The buffer size is passed to the Queue constructor.
  • Use Queue::pushAsync(), which returns a Future that is resolved when the value is consumed, but does not suspend the current fiber.

If you don't care when the value is consumed, consider the first option and pass some very large integer. This avoids even creating a Future, because there's nothing to await.
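
A minimal sketch of both approaches, assuming amphp/pipeline v1 (the $messages values are placeholders, not from the original report):

```php
<?php

use Amp\Pipeline\Queue;

// Placeholder payloads standing in for whatever the producer pushes.
$messages = ['msg-1', 'msg-2', 'msg-3'];

// Option 1: construct the Queue with a buffer, so push() only suspends
// once the buffer is full. A very large buffer effectively removes
// back-pressure and avoids creating Futures entirely.
$buffered = new Queue(1_000_000);
foreach ($messages as $message) {
    $buffered->push($message); // returns immediately while the buffer has room
}
$buffered->complete();

// Option 2: keep the default (unbuffered) Queue but use pushAsync(),
// which returns a Future resolved once the value is consumed instead of
// suspending the current fiber.
$unbuffered = new Queue();
foreach ($messages as $message) {
    $unbuffered->pushAsync($message); // ignore the Future if you don't care when it's consumed
}
$unbuffered->complete();
```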

Give those a try and let me know if you have any luck.

I tried (1), but now that I think about it, that's probably the issue: (IIRC, about to hop on a plane) each operation on the pipeline uses its own queue, which creates back-pressure. Perhaps that's what I'm seeing?

@trowski that is indeed the issue, but it's out of my hands: each step in the pipeline creates its own queue with a default buffer size of 0. This has a fairly significant performance impact when there is a lot of data flowing through the pipeline.
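
To make the shape of the problem concrete, a rough sketch of that kind of multi-step pipeline (the callbacks are arbitrary placeholders); per the above, each chained operation is backed by its own internal queue with a zero-sized buffer:

```php
<?php

use Amp\Pipeline\Queue;

$queue = new Queue(); // source queue, default buffer size of 0

$pipeline = $queue->pipe()
    ->map(fn (string $frame) => strtoupper($frame)) // step 1: its own internal queue
    ->filter(fn (string $frame) => $frame !== '')   // step 2: its own internal queue
    ->map(fn (string $frame) => strlen($frame));    // step 3: its own internal queue

// With unbuffered queues at every step, each push() waits for the value
// to travel through every stage before the producer can continue.
```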

I pushed a branch, buffer, which adds Pipeline::buffer(int $bufferSize). Please use that branch as your dependency and call that method immediately after creating the pipeline, trying various buffer sizes. Let me know what effect that has on performance.
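
Something along these lines, with the buffer size and the map callback being arbitrary placeholders:

```php
<?php

use Amp\Pipeline\Queue;

$queue = new Queue();

// buffer() is called immediately after creating the pipeline, before any
// other operations, so downstream steps read from a buffered source.
// Try various sizes and compare throughput.
$pipeline = $queue->pipe()
    ->buffer(1024)
    ->map(fn (string $frame) => strtoupper($frame));
```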

This has now been released in 1.2.0.

But @withinboredom never confirmed whether the new method was actually useful?

Ah, it's useful, @Bilge! I just never came back with performance numbers because I'm in the middle of a huge refactor. FWIW, it is orders of magnitude faster; I just don't have a way to get hard numbers.