palacaze / sigslot

A simple C++14 signal-slots implementation

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Arguments are copied unnecessarily

ryb-ableton opened this issue · comments

Thanks for the great library!

I recently noticed that value arguments (e.g., signal<std::string>, not signal<const std::string&>) are not passed optimally with regards to moves/copies. As shown in the test below, there are some unnecessary copies. If CopyCounter were replaced with a large std::vector, this could be a performance problem.

struct CopyCounter
{
    CopyCounter() = default;
    CopyCounter(CopyCounter&&) = default;
    CopyCounter& operator=(CopyCounter&&) = default;

    CopyCounter(const CopyCounter&)
    {
        ++numCopies;
    }

    CopyCounter& operator=(const CopyCounter&)
    {
        ++numCopies;
        return *this;
    }

    static int numCopies;
};
int CopyCounter::numCopies = 0;

void test_argument_copies_for_lvalue() {
    CopyCounter::numCopies = 0;

    sigslot::signal<CopyCounter> sig;
    sig.connect([] (CopyCounter) {});

    CopyCounter c;
    sig(c);

    // Actual
    assert(CopyCounter::numCopies == 2);
    // Expected
    // assert(CopyCounter::numCopies == 1);
}

void test_argument_copies_for_rvalue() {
    CopyCounter::numCopies = 0;

    sigslot::signal<CopyCounter> sig;
    sig.connect([] (CopyCounter) {});

    sig(CopyCounter{});
    // Actual:
    assert(CopyCounter::numCopies == 2);
    // Expected:
    // assert(CopyCounter::numCopies == 0);
}

I believe one fix would be to use std::forward for arguments in signal_base::operator(), to only std::forward arguments for the last slot in slot_base::operator(), and to std::forward (not std::move) arguments in slot*::call_slot. A stack overflow answer has some information about only forwarding the last slot's args.

It would be possible to go further and also eliminate copies for non-movable types, but may not be worth the complexity.

Thanks for the detailed report.

I agree that the current implementation is not optimal with respect to arguments forwarding.

The first part of your suggested fix is simple to apply, and will avoid one copy for rvalues. The second part is considerably trickier, as it involves forwarding arguments to a virtual method (call_slots).

I will look into it.

So, I had an idea for a quick-fix by performing a conditional move depending on the type of each argument to call_slot().
I pushed a proof of concept on the perf branch.

It appears to pass all the unit tests but I will a to test some corner cases. Copy-only and rvalue reference arguments come to mind.

Do tell me if this is what you had in mind.

@palacaze Sorry for the slow answer. That's what I had in mind!

Some feedback on the perf branch:

  • I believe maybe_move is equivalent to std::forward. Perfect forwarding is the typical use case for std::forward, but it can also be used as a conditional move for cases like this.
  • Instead of summing group sizes in signal_base::operator(), did you think about using iterators for the loops (instead of range-for loops) and a per-iteration check for the last element? That might be faster when using many groups (would be interesting to benchmark).
  • More tests would definitely be a good idea. Especially a test that an rvalue emit argument is only moved once if there are multiple slots. You might count copy/move assignment in addition to construction to be complete.

Would you prefer to implement this yourself or take a PR from me? I'm on vacation next week but could potentially work on it after.

@palacaze Sorry for the slow answer. That's what I had in mind!

Some feedback on the perf branch:

* I believe `maybe_move` is equivalent to `std::forward`. Perfect forwarding is the typical use case for `std::forward`, but it can also be used as a conditional move for cases like this.

Yes, this seems to give the same result. maybe_move uses explicit casting whereas std::forward relies on reference collapsing. Using std::forward crossed my mind but this is not a customary use for it, and I worried it would not convey the semantics correctly.
Anyway, this is hardly a problem, so I will use std::forward instead.

* Instead of summing group sizes in `signal_base::operator()`, did you think about using iterators for the loops (instead of range-for loops) and a per-iteration check for the last element? That might be faster when using many groups (would be interesting to benchmark).

Two trade-offs for groups right now to minimize connection and disconnection costs is to keep an ordered vector or groups, and also to keep empty groups. This was the easy to implement and signal emission was not really impacted. Consequently, I cannot rely on iteration because the last group may very well be empty.

Another possible implementation may be to keep a single vector with all the slots, weakly ordered by group_id. Connections and disconnections will be more expensive, but signal emission will be more straightforward and maybe cheaper. The reason why I avoided this design is to avoid pulling the algorithm header. Maybe I should revisit this design choice.

* More tests would definitely be a good idea. Especially a test that an rvalue emit argument is only moved once if there are multiple slots. You might count copy/move assignment in addition to construction to be complete.

Would you prefer to implement this yourself or take a PR from me? I'm on vacation next week but could potentially work on it after.

I forgot to reply to your question.

This is my last day of vacation and I may not have a lot of time to work on this in the next few weeks. So, yes, I would not happily accept a PR.

One thing I want to do soon is to setup a CI, because this is becoming tricky to test for regressions on multiple platform and OS combinations.

So, yes, I would not happily accept a PR.

You mean you'd accept a PR, right? 😄

Yes, sorry...

@palacaze Wondering if you had any luck testing the edge cases on the branch perf. My team and I would greatly benefit from move semantics in this library.

As would I, thanks for the reminder!