google / guava

Hi,

The following flaky tests in ForwardingMapTest were detected using the NonDex tool:

com.google.common.collect.ForwardingMapTest.testToStringWithNullValues
com.google.common.collect.ForwardingMapTest.testToStringWithNullKeys

The flakiness comes from comparing the toString() results of HashMaps; HashMap makes no guarantees about its iteration order.

NonDex command:
edu.illinois:nondex-maven-plugin:1.1.3-SNAPSHOT:nondex -Dtest=com.twitter.graphjet.algorithms.salsa.SalsaTest#testSalsaWithRandomGraph

Error log:

Failed tests: 
  testToStringWithNullKeys(com.google.common.collect.ForwardingMapTest): expected:<{[foo=bar, null=baz]}> but was:<{[null=baz, foo=bar]}>
  testToStringWithNullValues(com.google.common.collect.ForwardingMapTest): expected:<{[foo=bar, baz=null]}> but was:<{[baz=null, foo=bar]}>

The fix is simply to use LinkedHashMap instead of HashMap, which makes the iteration order deterministic. I can raise a PR for this.
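As a rough illustration of the proposed change, here is a minimal standalone sketch. It is not the actual test code: the class and method names are made up for this example, and it uses plain java.util types rather than Guava's Maps helpers. It only shows that a LinkedHashMap populated like the maps in the failing tests prints its entries in insertion order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch class, not part of Guava or its test suite.
class InsertionOrderSketch {
    // Builds a map with the same entries as the failing test, but backed by
    // LinkedHashMap, which iterates in insertion order.
    static Map<String, String> build() {
        Map<String, String> map = new LinkedHashMap<>();
        map.put("foo", "bar");
        map.put("baz", null);
        return map;
    }

    public static void main(String[] args) {
        // AbstractMap.toString() walks the entry set, so the output follows
        // insertion order.
        System.out.println(build()); // prints {foo=bar, baz=null}
    }
}
```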

However, I am raising this issue first to discuss the approach and whether the fix is desirable at all, since I understand that these test methods were written 8 years ago.

Those are tests of ForwardingMap, which uses (although indirectly) the delegate map's entry iterator. So these tests shouldn't be flaky unless HashMap.toString() returns something different when called twice on the same instance with no mutation, which I don't think it does.

And in any case, these are tests of ForwardingMap's toString() implementation, not simple usages of a map's string form to make assertions for an arbitrary test.

We haven't seen any flakiness in practice.

Please reopen if you think this is a mistake.

I agree with netdpb that there isn't a practical danger here worth fixing.

A clarification, though: This test is comparing toString() from two different HashMap instances (one of which is, as noted, wrapped in a ForwardingMap):

guava/guava-tests/test/com/google/common/collect/ForwardingMapTest.java

Lines 312 to 318 in 6d7e326

    
    hashmap.put("foo", "bar");
    hashmap.put("baz", null);
    StandardImplForwardingMap<String, String> forwardingMap =
        new StandardImplForwardingMap<>(Maps.<String, String>newHashMap());
    forwardingMap.put("foo", "bar");
    forwardingMap.put("baz", null);

That's at least arguably worse than comparing two calls to toString() on the same instance. I mean, in theory, even calling toString() twice on the same HashMap isn't guaranteed to match :) But it currently does, and it's extremely likely to continue to. What we're doing in our test is a step up from that: We're building two HashMap instances in the same VM through identical series of method calls. In practice, that always matches, too, but one could imagine more HashMap implementations under which it doesn't.

The big danger we've seen over the years is when comparing a HashMap to some other kind of Map or at least to a HashMap that's built "in a different way." "In a different way" is a bit fuzzy to define, though. So, in practice, what we've found useful is to randomize the "step" size used when iterating over a HashMap for each VM during our tests. That way, two HashMap instances with identical bucket contents still iterate in the same order, but any difference beyond that will eventually come to our attention.
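To make the "built in a different way" danger concrete, here is a small sketch under stated assumptions. It uses LinkedHashMap as a stand-in so the effect is reproducible on any JVM (HashMap's order is unspecified, so it cannot be shown deterministically), and the class and helper names are hypothetical. Two maps with the same entries are equal according to Map.equals(), yet their string forms differ because the entries were inserted in a different order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch class, not part of Guava or its test suite.
class MapOrderSketch {
    // Hypothetical helper: inserts two entries in the order given.
    static Map<String, String> build(
            String firstKey, String firstValue, String secondKey, String secondValue) {
        Map<String, String> map = new LinkedHashMap<>();
        map.put(firstKey, firstValue);
        map.put(secondKey, secondValue);
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> a = build("foo", "bar", "baz", null);
        Map<String, String> b = build("baz", null, "foo", "bar");

        // Map.equals() compares entries and ignores iteration order.
        System.out.println(a.equals(b)); // prints true

        // toString() follows iteration order, so the strings differ.
        System.out.println(a.toString().equals(b.toString())); // prints false
    }
}
```

A string comparison like the second one is only safe when both maps are known to iterate in the same order, which is the property the randomized iteration in our tests is meant to probe.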

I don't know if that information is useful for a discussion with Nondex, but that's our experience so far.

In any case, thanks for the report: This did lead me to realize that our current randomization could be better for the case of multiple entries in the same hash bucket.

Thanks for the detailed feedback!


Flaky tests in ForwardingMapTest