tidwall / neco

Concurrency library for C (coroutines)

Slow creation of coroutines?

deckarep opened this issue · comments

Hello,

This library is amazing! Neco is well documented and well put together so thank you!

But I wanted to ask about slow coroutine creation. I'm building an app that does some analysis of the theoretical limits of this library's upper bound on coroutines.

Specifically, I'm noticing that I can safely spawn tens of thousands of coroutines and have them do useful work, and they are performant for my use case, but the actual spawn process (creating the coroutines) is fairly slow compared to everything else. It makes application startup very slow as a result.

My question is: do you have any suggestions for how to optimize this as a user of the library? One theory I had, though, was that maybe your API could offer a way to specify a capacity hint, if the problem is a lot of memory churn from continually creating each coroutine.

The function in question is neco_start; my application simply calls it many times in a loop at startup.

Any direction you can offer is appreciated.

One observation I had:

I'm realizing that since all my code is single-threaded, the coroutines start doing work as soon as I create them, so my startup is quite slow, likely because it has to share more and more CPU time with coroutines that are already working. This makes sense actually.

So I was able to see a considerable startup speedup by having my coroutines sleep for some time immediately. Now my startup is much faster, but the coroutines are all on standby until the sleep timeout expires.

So my follow-up question: is there a way to do the creation of coroutines and start them only after they've all been created?
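To put rough numbers on the observation above, here is a toy model in plain C. It is not neco's scheduler. It assumes, purely for illustration, that each time the spawning loop starts one more coroutine, every coroutine that is already doing work gets one slice of CPU before the loop continues. The function name slices_during_startup and the gated flag are hypothetical and do not come from the library.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (an assumption, not neco's real scheduling policy):
 * before each spawn, every coroutine that is already working takes
 * one slice of CPU. A "gated" coroutine blocks as soon as it starts,
 * so it never joins the set of working coroutines during startup. */
uint64_t slices_during_startup(uint64_t n, int gated) {
    /* Keeps n*(n-1)/2 well inside the range of uint64_t. */
    assert(n <= UINT32_MAX);

    uint64_t working = 0; /* coroutines currently competing for CPU */
    uint64_t slices = 0;  /* slices consumed before the loop finishes */
    for (uint64_t i = 0; i < n; i++) {
        slices += working; /* each working coroutine runs once */
        if (!gated) {
            working++;     /* the new coroutine starts working right away */
        }
    }
    return slices;
}
```

Under this assumption, spawning 100000 ungated coroutines lets the others run 4999950000 slices before the loop finishes, which is n*(n-1)/2, while coroutines that block right away consume none. That quadratic growth is one plausible reading of why startup gets slower as more coroutines come alive.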

Hi.

I'm realizing that since all my code is single-threaded, the coroutines start doing work as soon as I create them, so my startup is quite slow, likely because it has to share more and more CPU time with coroutines that are already working. This makes sense actually.

I think your observation is on point.

is there a way to do the creation of coroutines and start them only after they've all been created?

Yes. Perhaps using a waitgroup will suffice.

Below I'm starting 100k coroutines all at once and waiting for them all as a group to start.

It takes about 400 ms on my machine.

#include <stdio.h>
#include "neco.h"

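void coro(int argc, void *argv[]) {
    // argv[0] points at neco_main's loop variable, so copy its value
    // before this coroutine blocks.
    int idx = *(int*)argv[0];
    neco_waitgroup *wg = argv[1];

    // Report that this coroutine has started, then wait until all
    // the others have reported in too.
    neco_waitgroup_done(wg);
    neco_waitgroup_wait(wg);

    // ... time to work ... //
}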

int neco_main(int argc, char *argv[]) {
    int N = 100000;

    // Use a waitgroup to wait for the child coroutines to initialize.
    neco_waitgroup wg;
    neco_waitgroup_init(&wg);
    neco_waitgroup_add(&wg, N);

    // Start coroutines
    int64_t start = neco_now();
    for (int i = 0; i < N; i++) {
        neco_start(coro, 2, &i, &wg);
    }

    // Wait for all coroutines to start
    neco_waitgroup_wait(&wg);

    int64_t elapsed = neco_now()-start;
    printf("all started in %.0f ms\n", (double)elapsed/1e6);

    return 0;
}

@tidwall - big thank you for this help. I think this will actually work. I was trying to pull off something similar with the neco_cond type, but I think this will possibly be more straightforward.

You're welcome.
I initially considered using a neco_cond too but decided a waitgroup probably makes more sense in this case. Either one will probably perform about the same.