The NaN problem
shawnsmithdev opened this issue · comments
One of the defining characteristics of zermelo is that it supports constraints.Float types, likely a rarity for radix sort libraries given the bit-flipping and type-casting shenanigans you have to do to make it happen. Which means we need to deal with NaN values.
NaNs are lots of fun. By their own admission, they are not numbers. You can't sort them because you can't even compare them, much less consider their digits by radix. You can only have a policy on what to do with them when present.
zermelo currently does a full linear scan through the slice before any other action, looking for NaNs and moving them to the front, ahead of all other elements. This only happens in the constraints.Float code. It then does its flip-sort-flip magic on the remainder of the slice. This behavior was chosen because it is also how sort.Float64s() handles NaNs.
But now comes slices.Sort(), the new generic comparison sort in golang.org/x/exp. I fully expect constraints and slices and maps to show up in the stdlib, probably in go1.19 if nothing goes wrong. We shall see, but that is my assumption. I do know it is much faster than the sort package, likely due to less function pointer dereferencing and more inlining, and probably also just a better comparison sort implementation.
I'm not quite sure if there is a defined behavior at all for NaNs in slices.Sort, but it seems unlikely given that it is a generic comparison sort on constraints.Ordered. What I am sure of is that it isn't the same as sort.
Which is all a long way of saying we need an official policy, documented and tested, about what is done with NaNs.
I've decided to continue putting NaNs up front: that was the behavior before, and it is fast, taking a single linear pass. v1.5.2 now includes tests for this. I will want to mention it in the actual README later, when I do a better update to highlight the generics code.
This is basically resolved. Improving the documentation will probably happen in the 2.0 release and be backported later.