xiaodaigh / SortingLab.jl

Faster sorting algorithms (sort and sortperm) for Julia

Repository from GitHub: https://github.com/xiaodaigh/SortingLab.jl

fsortperm still much slower than fsort

ParadaCarleton opened this issue

julia> @btime fsortperm($x)
  1.210 s (18 allocations: 514.00 MiB)
julia> @btime fsort($x)
  486.066 ms (18 allocations: 256.19 MiB)

Here's what's bizarre -- in R, the pattern is almost exactly reversed: order takes about 0.651 seconds vs. 1.01 seconds for sort.

What's your R code? Returning the order should be slower, based on my understanding.

The R code is just order(rexp(2^24)). Apparently, though, the two take the same amount of time, so long as they use the same settings. The two methods should theoretically take almost exactly the same amount of time, since any sort can be transformed into a sortperm by sorting an array of tuples (array[i], i) and then using collect(second.) to pull out the indices (which should take way less time than the sort).
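The tuple idea above can be sketched in plain Julia. This is only an illustration of the argument, not SortingLab's code: the function name `sortperm_via_tuples` is hypothetical, and it uses Base's `sort!` rather than the package's radix sort.

```julia
# Minimal sketch (hypothetical helper, not part of SortingLab):
# turn an ordinary sort into a sortperm by carrying each index along.
function sortperm_via_tuples(x::AbstractVector)
    # Pair every value with its position: (x[i], i).
    tagged = [(x[i], i) for i in eachindex(x)]
    # Tuples compare lexicographically, so this orders by value,
    # breaking ties by the original index.
    sort!(tagged)
    # Keep only the index component; this extra pass is O(n).
    return last.(tagged)
end

x = rand(10^5)
p = sortperm_via_tuples(x)
```

Indexing with the result, `x[p]`, gives the sorted vector. The only work beyond the sort itself is building the tuples and extracting the indices, which is why one would expect the two timings to be close.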

There is an undocumented sorttwo! function that can sort the primary array and an index at the same time.

I seem to be getting an error --

julia> using SortingLab

help?> sorttwo!
search:

Couldn't find sorttwo!
Perhaps you meant sort! or sort
  No documentation found.

  Binding sorttwo! does not exist.

Ahh, I see, it's not exported. Thanks.
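For anyone hitting the same lookup failure: an unexported function has to be reached through its module name. The sketch below uses a made-up module `DemoLab` with hypothetical functions `visible_sort` and `hidden_sort!`, so it runs without any package installed. By the same rule the call here would be qualified as `SortingLab.sorttwo!`; its arguments are not shown in this thread.

```julia
# Hypothetical module standing in for a package with an unexported function.
module DemoLab
export visible_sort
visible_sort(v) = sort(v)     # exported: usable unqualified after `using`
hidden_sort!(v) = sort!(v)    # defined but NOT exported
end

using .DemoLab

v = [3, 1, 2]
sorted_copy = visible_sort(v)   # works unqualified
DemoLab.hidden_sort!(v)         # unexported: needs the module prefix

# `names` lists only exported names unless `all = true` is passed.
exported_names = names(DemoLab)
all_names = names(DemoLab; all = true)
```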

As for R, it's worth noting that "Same speed" is being slightly liberal -- the difference is smaller, but it's still there.

> microbenchmark::microbenchmark(order(x))
Unit: milliseconds
     expr      min       lq     mean  median      uq      max neval
 order(x) 589.2389 610.3625 615.4213 613.978 624.377 642.1687   100
> ?sort
> microbenchmark::microbenchmark(sort(x,na.last=TRUE))
Unit: milliseconds
                    expr      min       lq     mean   median       uq      max neval
 sort(x, na.last = TRUE) 784.3452 799.5148 850.2098 821.1592 847.6561 1504.099   100

I don't know how R does it, but for sorting 10 Float32 on my machine, fsort and fsortperm are both faster than R's sort and order.

I guess unless you can tell me WHY R has a faster order than sort, there's nothing I can do, so I will close it.

array[i], i and then using collect(second.)

This is exactly what sorttwo! does, and it is the algorithm behind fsortperm.

R's order is faster than its sort because sort calls order internally -- it builds the index for the array, then indexes into it.
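In Julia terms, the shape described above (build the index, then index into the array) looks roughly like this. It is a sketch of the idea only, not R's actual implementation, and `sort_via_perm` is a hypothetical name.

```julia
using Random: randexp

# Sort expressed through the permutation: compute the order first,
# then gather the elements in that order.
sort_via_perm(x) = x[sortperm(x)]

x = randexp(2^10)        # exponential draws, analogous to R's rexp
y = sort_via_perm(x)
```

Written this way, sorting costs one permutation computation plus one gather, so it can never be cheaper than computing the permutation alone.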

I'm not sure why R is faster on my computer, but not on yours. sorttwo! takes about a second, still significantly longer than order.