make_slice() appears to be copying data
vigna opened this issue
I'm trying to use the Parlay integer sort to replace Boost's block_indirect_sort(). Where I would use
boost::sort::block_indirect_sort(p, p + points);
where p is the pointer to the data and points is the data size, I'm using
parlay::integer_sort_inplace(parlay::make_slice(p, p + points));
which seems to me the right thing after looking at the benchmark code (there are no examples or documentation). The problem is that make_slice() is copying the data, doubling the memory usage. What am I doing wrong?
WHAT?!? 😂
I spent two hours chasing your code to understand how you could do radix sort in place without following cycles. I even debated the pros and cons of your in-place sort vs. block indirect sort... and you're playing with words?
"In place" has a precise meaning. You don't use additional memory (or at least very limited additional memory): https://en.wikipedia.org/wiki/In-place_algorithm
Where in the documentation is there a pointer to the fact that "in place" means something completely different for you?
Well, I guess at this point your "stable" algorithms are... well, something. Not sure they're stable. 🤷🏻♂️
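For readers unfamiliar with the "following cycles" technique mentioned in this comment, here is a minimal sketch of one in-place distribution pass on a single byte of each key (the permutation step of what is often called American flag sort). The function name inplace_byte_pass and its signature are hypothetical, chosen only for illustration. This is not Parlay's or Boost's code. The only extra memory is three 256-entry tables, regardless of input size. Note that this pass is not stable.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: permute `a` in place so that keys are grouped by the
// byte selected by `shift` (0, 8, 16 or 24), in increasing byte order.
inline void inplace_byte_pass(std::vector<std::uint32_t>& a, unsigned shift) {
    assert(shift < 32);  // shifting a 32-bit value by >= 32 is undefined
    auto digit = [shift](std::uint32_t x) -> std::size_t {
        return (x >> shift) & 0xFFu;
    };

    // Count how many keys fall into each of the 256 buckets.
    std::array<std::size_t, 256> count{};
    for (std::uint32_t x : a) ++count[digit(x)];

    // next[b] is the first unfilled slot of bucket b, end[b] is one past
    // the last slot of bucket b.
    std::array<std::size_t, 256> next{}, end{};
    std::size_t sum = 0;
    for (std::size_t b = 0; b < 256; ++b) {
        next[b] = sum;
        sum += count[b];
        end[b] = sum;
    }

    // Follow cycles: look at the key sitting in the first unfilled slot of
    // bucket b. If it belongs to another bucket d, swap it into d's next
    // free slot and examine the key that came back. Otherwise advance.
    for (std::size_t b = 0; b < 256; ++b) {
        while (next[b] < end[b]) {
            std::size_t d = digit(a[next[b]]);
            if (d == b) {
                ++next[b];
            } else {
                std::swap(a[next[b]], a[next[d]]);
                ++next[d];
            }
        }
    }
}
```

Calling inplace_byte_pass(a, 0) on keys that all fit in one byte sorts them completely. A full sort of 32-bit keys would apply the pass to the most significant byte and then recurse into each bucket.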
The algorithm isn't actually as bad as copying the input. It never explicitly copies any data, which can be seen from the fact that it will happily sort uncopyable types, e.g.,
auto s = parlay::tabulate(1000000, [](size_t i) {
  return std::make_unique<unsigned>(1000000-i);
});
parlay::integer_sort_inplace(s, [](auto&& p) { return *p; });
It's not memory efficient, because it does use a temporary buffer, and elements are swapped between the input and that buffer. The main issue is that making it use zero extra memory is difficult without hurting performance, and most of Parlay's users would complain if the performance got worse (substantially more than would complain that it uses more memory).
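To make the distinction concrete, here is a minimal sequential sketch of a sort that leaves its result in the caller's container and never copies an element, yet still allocates a temporary buffer as large as the input. It is an assumption-laden illustration, not Parlay's implementation: the name buffered_radix_sort, the 32-bit key width, and the use of std::vector are all hypothetical choices made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical sketch: stable LSD radix sort on 32-bit keys. The result ends
// up in `a`, but a temporary buffer of a.size() elements is used. Elements
// are only moved, never copied, so move-only types such as std::unique_ptr
// work. T must be default-constructible and move-assignable.
template <typename T, typename KeyFn>
void buffered_radix_sort(std::vector<T>& a, KeyFn key) {
    std::vector<T> buf(a.size());  // the O(n) extra memory
    for (unsigned shift = 0; shift < 32; shift += 8) {
        auto digit = [&](const T& x) -> std::size_t {
            return (static_cast<std::uint32_t>(key(x)) >> shift) & 0xFFu;
        };

        // pos[d + 1] first holds the count of digit d; after the prefix
        // sum, pos[d] is the first output slot for digit d.
        std::array<std::size_t, 257> pos{};
        for (const T& x : a) ++pos[digit(x) + 1];
        for (std::size_t b = 0; b < 256; ++b) pos[b + 1] += pos[b];

        // Stable scatter: move each element into the buffer.
        for (T& x : a) {
            std::size_t d = digit(x);  // read the key before moving from x
            buf[pos[d]++] = std::move(x);
        }

        // Exchange the roles of input and buffer for the next pass.
        a.swap(buf);
    }
}
```

With a vector v of std::unique_ptr<unsigned>, the call buffered_radix_sort(v, [](const std::unique_ptr<unsigned>& p) { return *p; }) compiles and sorts v by the pointed-to values, even though the element type cannot be copied. Peak memory is still roughly twice the input, which is the behavior the original report observed.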
I'm not asking to change the implementation—I'm just asking to stop the cheating and the lies.
But I see this is not gonna happen.