google / guava

Google core libraries for Java

Spliterators for sorted collections cause inefficient stream operations when natural ordering is used

kilink opened this issue

The Spliterators returned by sorted collections such as ImmutableSortedSet and RegularImmutableAsList cause an unnecessary sort when sorted() is called on their associated Stream. The root cause is that the Spliterators' getComparator() implementations always return a Comparator instance, even when natural ordering is used.

According to the Javadoc, null should be returned by getComparator() to indicate the source items are sorted in natural order.

You can see here that the SORTED flag gets unset when a spliterator has the SORTED characteristic but returns a non-null comparator. And here is where the optimization is then missed as a result.

The issue is trivially reproducible by doing the following, and stepping through the code in SortedOps where the expected no-op / optimization would ideally happen:

ImmutableSortedSet<Integer> sortedSet = ImmutableSortedSet.of(1, 2, 3, 4);
// sorted should be a no-op, but isn't because ImmutableSortedSet's Spliterator does not return a null comparator.
sortedSet.stream().sorted().collect(Collectors.toList());
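
For anyone who wants to observe the effect without a debugger, here is a rough JDK-only sketch. It assumes that a TreeSet built with an explicit Comparator.naturalOrder() returns that comparator from its spliterator's getComparator(), which mirrors what the Guava collections do. The class SortedFlagDemo and the Counted wrapper are hypothetical names made up for this illustration; Counted simply counts compareTo calls so you can tell whether sorted() did any work.

```java
import java.util.Comparator;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class SortedFlagDemo {
    // Counts every compareTo call made on Counted instances.
    public static final AtomicInteger COMPARISONS = new AtomicInteger();

    /** Hypothetical Comparable wrapper used only to count comparisons. */
    public static final class Counted implements Comparable<Counted> {
        final int value;

        Counted(int value) {
            this.value = value;
        }

        @Override
        public int compareTo(Counted other) {
            COMPARISONS.incrementAndGet();
            return Integer.compare(value, other.value);
        }
    }

    // Adds four elements, then runs stream().sorted() and reports how many
    // comparisons happened during the stream pipeline only.
    private static int comparisonsDuringSorted(TreeSet<Counted> set) {
        for (int i = 1; i <= 4; i++) {
            set.add(new Counted(i));
        }
        COMPARISONS.set(0); // ignore comparisons made while building the set
        set.stream().sorted().collect(Collectors.toList());
        return COMPARISONS.get();
    }

    /** Spliterator.getComparator() returns null here, so sorted() can be skipped. */
    public static int comparisonsWithNullComparator() {
        return comparisonsDuringSorted(new TreeSet<Counted>());
    }

    /** Spliterator.getComparator() is non-null here, so the stream re-sorts. */
    public static int comparisonsWithExplicitNaturalOrder() {
        return comparisonsDuringSorted(
            new TreeSet<Counted>(Comparator.<Counted>naturalOrder()));
    }

    public static void main(String[] args) {
        System.out.println("null comparator:      " + comparisonsWithNullComparator());
        System.out.println("naturalOrder() passed: " + comparisonsWithExplicitNaturalOrder());
    }
}
```

On an OpenJDK-based runtime I would expect the first count to be zero and the second to be greater than zero, since only the first case keeps the SORTED flag.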

Nice find, and thanks for the deep links.

It's of course generally impossible to know whether a Comparator is actually implementing natural order, but we could at least special-case Comparator.naturalOrder(). @lowasser any thoughts?
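
A minimal sketch of what such a special case might look like, using only the JDK. NaturalOrderCheck and nullIfNatural are hypothetical names, not Guava API, and the identity check assumes Comparator.naturalOrder() always hands back the same instance, which holds in OpenJDK but is not something the Javadoc promises.

```java
import java.util.Comparator;

public class NaturalOrderCheck {
    // Assumption: Comparator.naturalOrder() returns a shared singleton,
    // as it does in OpenJDK. The element type chosen here is irrelevant.
    private static final Object NATURAL = Comparator.<String>naturalOrder();

    /**
     * Hypothetical helper: returns null when the given comparator is the
     * JDK natural-order comparator, so that a spliterator's getComparator()
     * can report natural ordering the way the Spliterator Javadoc describes.
     * Any other comparator is returned unchanged.
     */
    public static <T> Comparator<? super T> nullIfNatural(Comparator<? super T> comparator) {
        return comparator == NATURAL ? null : comparator;
    }
}
```

A spliterator's getComparator() could then return nullIfNatural(comparator) instead of the raw comparator. A comparator that merely behaves like natural ordering would still pass through unchanged, which is the "generally impossible to know" case.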

We should search for a JDK issue; given that it introduced that naturalOrder comparator I think it should do this trick itself. I also think the Spliterator javadoc could use a tweak so as not to imply the implementor is contractually bound to return null.

Oh funny, I didn't bother mentioning it could special-case Ordering.natural() too, because we hope users are moving off of that, but that might be what our spliterators are returning at the moment.

I agree that the JDK ought to handle this check itself, although it wouldn't help Guava at the moment due to Ordering.natural(). I'm assuming the "null means natural ordering" contract is a holdover from SortedMap / SortedSet.