Spliterators for sorted collections cause inefficient stream operations when natural ordering is used
kilink opened this issue · comments
The Spliterators returned by sorted collections such as ImmutableSortedSet and RegularImmutableAsList result in unnecessary sorting when a sort operation is performed on their associated Stream. The root cause is that the Spliterators' getComparator() implementations always return a Comparator instance, even when natural ordering is used.
According to the Javadoc, getComparator() should return null to indicate that the source items are sorted in natural order.
You can see here that the SORTED flag gets unset when a spliterator has the SORTED characteristic but does not return a null comparator. And here is where the optimization is missed as a result.
The issue is trivially reproducible by doing the following, and stepping through the code in SortedOps where the expected no-op / optimization would ideally happen:
ImmutableSortedSet<Integer> sortedSet = ImmutableSortedSet.of(1, 2, 3, 4);
// sorted should be a no-op, but isn't because ImmutableSortedSet's Spliterator does not return a null comparator.
sortedSet.stream().sorted().collect(Collectors.toList());
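The same effect can be observed without a debugger and without Guava. What follows is a minimal sketch using only JDK classes, assuming OpenJDK's stream implementation: Counted, COMPARISONS, fill and comparisonsDuringSorted are hypothetical names introduced for illustration. It builds one TreeSet with no comparator and one with Comparator.naturalOrder(), then counts compareTo calls made while sorted() runs.

```java
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;

final class SortedFlagDemo {
    // Counts every compareTo call so that a real sort becomes observable.
    static final AtomicLong COMPARISONS = new AtomicLong();

    // Hypothetical element type whose natural ordering is instrumented.
    static final class Counted implements Comparable<Counted> {
        final int value;

        Counted(int value) {
            this.value = value;
        }

        @Override
        public int compareTo(Counted other) {
            COMPARISONS.incrementAndGet();
            return Integer.compare(value, other.value);
        }
    }

    static TreeSet<Counted> fill(TreeSet<Counted> set) {
        for (int i = 1; i <= 4; i++) {
            set.add(new Counted(i));
        }
        return set;
    }

    // Resets the counter, runs stream().sorted(), and reports how many
    // comparisons the pipeline performed.
    static long comparisonsDuringSorted(TreeSet<Counted> set) {
        COMPARISONS.set(0);
        List<Counted> ignored = set.stream().sorted().collect(Collectors.toList());
        return COMPARISONS.get();
    }

    public static void main(String[] args) {
        // No comparator: the spliterator reports a null comparator.
        TreeSet<Counted> nullComparator = fill(new TreeSet<>());
        // Same ordering, but expressed as an explicit Comparator instance.
        TreeSet<Counted> explicit =
                fill(new TreeSet<>(Comparator.<Counted>naturalOrder()));

        System.out.println("null comparator? "
                + (nullComparator.spliterator().getComparator() == null));
        System.out.println("null comparator? "
                + (explicit.spliterator().getComparator() == null));
        System.out.println("comparisons: " + comparisonsDuringSorted(nullComparator));
        System.out.println("comparisons: " + comparisonsDuringSorted(explicit));
    }
}
```

On OpenJDK I would expect no comparisons for the first set and a non-zero count for the second, which matches the behavior described above for ImmutableSortedSet. The exact count depends on the sort implementation, so it is not stated here.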
Nice find, and thanks for the deep links.
It's of course generally impossible to know whether a Comparator is actually implementing natural order, but we could at least special-case Comparator.naturalOrder(). @lowasser any thoughts?
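For illustration, the special case could look roughly like the sketch below. It is not Guava's actual code: NaturalOrderNormalization, isNaturalOrder and normalizing are hypothetical names, and the equals() check assumes that Comparator.naturalOrder() returns instances that compare equal to each other, which holds on OpenJDK but is not something the Javadoc promises. A real fix inside Guava would presumably apply the same idea to Ordering.natural().

```java
import java.util.Comparator;
import java.util.Spliterator;
import java.util.function.Consumer;

final class NaturalOrderNormalization {
    // Reference instance to compare against (assumption: equals() identifies it).
    private static final Comparator<?> NATURAL = Comparator.<String>naturalOrder();

    static boolean isNaturalOrder(Comparator<?> comparator) {
        return NATURAL.equals(comparator);
    }

    // Wraps a spliterator so that a natural-order comparator is reported as
    // null; everything else is delegated unchanged.
    static <T> Spliterator<T> normalizing(Spliterator<T> delegate) {
        return new Spliterator<T>() {
            @Override
            public boolean tryAdvance(Consumer<? super T> action) {
                return delegate.tryAdvance(action);
            }

            @Override
            public void forEachRemaining(Consumer<? super T> action) {
                delegate.forEachRemaining(action);
            }

            @Override
            public Spliterator<T> trySplit() {
                Spliterator<T> split = delegate.trySplit();
                return split == null ? null : normalizing(split);
            }

            @Override
            public long estimateSize() {
                return delegate.estimateSize();
            }

            @Override
            public int characteristics() {
                return delegate.characteristics();
            }

            @Override
            public Comparator<? super T> getComparator() {
                // Propagates IllegalStateException if the delegate is not SORTED.
                Comparator<? super T> comparator = delegate.getComparator();
                return isNaturalOrder(comparator) ? null : comparator;
            }
        };
    }
}
```

A stream could then be built from the wrapped spliterator with StreamSupport.stream(normalizing(spliterator), false), so that the SORTED flag survives and a later sorted() call can be skipped. In practice the simpler fix is for the collection's own spliterator to return null directly rather than going through a wrapper.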
We should search for a JDK issue; given that it introduced that naturalOrder comparator, I think it should do this trick itself. I also think the Spliterator javadoc could use a tweak so as not to imply the implementor is contractually bound to return null.
Oh funny, I didn't bother mentioning it could special-case Ordering.natural() too, because we hope users are moving off of that, but that might be what our spliterators are returning at the moment.
I agree that the JDK ought to handle this check itself, although it wouldn't help Guava at the moment due to Ordering.natural(). I'm assuming the "null means natural ordering" contract is a holdover from SortedMap / SortedSet.