automerge / automerge-swift

Swift language bindings presenting Automerge

Home Page: https://automerge.org/automerge-swift/documentation/automerge/

resolve potentially flaky test

heckj opened this issue

When updating documentation with #81, the post-merge build reported a failure on tests.

The relevant failure:

Test Case '-[AutomergeTests.PatchesTestCase testReceiveSyncMessageWithPatches]' started.
/Users/runner/work/automerge-swift/automerge-swift/Tests/AutomergeTests/TestPatches.swift:43: error: -[AutomergeTests.PatchesTestCase testReceiveSyncMessageWithPatches] : XCTAssertEqual failed: ("[]") is not equal to ("[Automerge.Patch(action: Automerge.PatchAction.Put(ObjId.ROOT, Automerge.Prop.Key("key2"), ScalarValue<String(value2)>), path: [])]")

I verified locally again that this test doesn't fail on the current main branch, but I think it's worth digging deeper to see whether this is a flaky test or some other issue (my system is more up to date than CI, so there could be a lingering issue there).

Just hit this one again in CI; investigating further.

I wrapped the potentially flaky test in a loop. At 100 iterations it frequently, but not always, hits the failure, so I increased the loop to 1000 iterations, which reproduces it reliably. From my manual investigation, the failure rate is around 0.9%.
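The loop-and-count approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code on the branch: `failingIterations` and `simulatedCheck` are hypothetical names, and the simulated check stands in for the real test body (which would run the sync and compare the patches).

```swift
import Foundation

// Hypothetical helper: runs `check` once per iteration and collects the
// indices of the iterations where it reported failure (returned false).
func failingIterations(count: Int, check: (Int) -> Bool) -> [Int] {
    (0..<count).filter { !check($0) }
}

// Stand-in for the real test body (an assumption for illustration only):
// it fails deterministically on 9 out of every 1000 iterations.
func simulatedCheck(_ iteration: Int) -> Bool {
    iteration % 1000 >= 9
}

let iterations = 1000
let failures = failingIterations(count: iterations, check: simulatedCheck)
let failureRate = Double(failures.count) / Double(iterations)
print("\(failures.count) failures in \(iterations) iterations, rate \(failureRate)")
```

Recording the failing iteration indices, rather than stopping at the first failure, makes it easier to estimate the rate and to correlate failures with other per-iteration output.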

Looks like it's definitely some sort of race condition.

After the two documents are in sync, the code is using the core library very directly, with serialization enforced by calls within a Serial dispatch queue.
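For readers unfamiliar with the pattern, serializing access through a serial dispatch queue looks roughly like the sketch below. It is a generic illustration under assumed names (`SerializedCounter`, `runDemo`), not the library's actual wrapper code: every read and write of the shared state goes through `queue.sync`, so concurrent callers cannot interleave inside the critical section.

```swift
import Dispatch

// Hypothetical example type: all access to `value` is funnelled through
// a private serial queue, so only one caller touches it at a time.
final class SerializedCounter {
    private let queue = DispatchQueue(label: "example.serialized-counter")
    private var value = 0

    func increment() {
        queue.sync { value += 1 }
    }

    var current: Int {
        queue.sync { value }
    }
}

// Hammers the counter from many threads at once and returns the final count.
func runDemo(iterations: Int) -> Int {
    let counter = SerializedCounter()
    DispatchQueue.concurrentPerform(iterations: iterations) { _ in
        counter.increment()
    }
    return counter.current
}

print(runDemo(iterations: 1000))
```

Note that this kind of serialization only protects each individual call. A sequence of several calls can still interleave with other work unless the whole sequence runs inside one block on the queue.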

cc @alexjg - Since sync is something that's been dug into recently, I wanted to give you a heads-up on this issue. It's intermittent flakiness, but I'm not clear on where the issue arises, since most of the heavy lifting here is in the core Rust library.

The test (https://github.com/automerge/automerge-swift/blob/flake_test_hunt/Tests/AutomergeTests/TestPatches.swift#L32-L64) creates two documents, forces a sync (which I verified by checking the serialized versions), then modifies one and calls receiveSyncMessageWithPatches on the other to verify that the patches are generated correctly. In roughly 1 of 100 iterations (the rate varies around there), it returns an empty set of patches instead of the single expected patch.

I added in some recent test code to make sure the documents were actually in sync before invoking the edit & patch mechanisms.

For debugging purposes, is there any way to "reasonably" decode the sync messages? At the moment they're a bit opaque, and I'm not sure if the issue is in the call to generate the sync message or in the call to receive the generated sync message. Any suggestions? Or does anything else spring to mind?

The updated iterative test code and verification is in branch flake_test_hunt (https://github.com/automerge/automerge-swift/tree/flake_test_hunt)

I did go ahead and dump the sync messages, and it looks like the issue could be there. I can't decode them, and they're not consistent across runs because I'm creating new documents with each iteration, but in the failure scenarios the sync messages are notably shorter:

Checking iteration 946
  Sync Msg: 174 bytes: 4201031d3825677cae0b24bc52f5c1f916657966e299bdbe6f685febb498311f69e70001016a786e341f2175b8b64eaa94516acc22e6ab3e0ccd408b3799a637ea76de9fa505010a07b8460161856f4a83031d38250157016a786e341f2175b8b64eaa94516acc22e6ab3e0ccd408b3799a637ea76de9fa5107f09cd80067541cb93910506457910920102000000061506340142025602570670027f046b657932017f017f6676616c7565327f00

versus:

Checking iteration 947
  Sync Msg: 76 bytes: 4201180a072590655fb39085ce86e2fca80de391c46dc6c547ffb02c9a9db807dff9000101ad5f0d0deb94803cf74c85ffeac776b9b2f3683799e94396f125e956cba3d61b05010a07000100
/Users/heckj/src/automerge-swift/Tests/AutomergeTests/TestPatches.swift:66: error: -[AutomergeTests.PatchesTestCase testPatchesInLoop] : XCTAssertEqual failed: ("[]") is not equal to ("[Automerge.Patch(action: Automerge.PatchAction.Put(ObjId.ROOT, Automerge.Prop.Key("key2"), ScalarValue<String(value2)>), path: [])]")

In each error instance, the sync message was 76 bytes instead of the otherwise consistent 174 bytes.
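On the question of decoding the messages for debugging, a rough structural summary is possible without the library. The sketch below assumes the wire layout used by the core Rust sync implementation: a 0x42 type byte, LEB128-counted lists of 32-byte head and need hashes, a list of "have" entries (each with last-sync hashes and a length-prefixed Bloom filter), and finally a list of length-prefixed changes. That layout is an assumption here and is not confirmed in this thread, and all type and function names are hypothetical. Under that assumption, the 76-byte message from iteration 947 parses as carrying zero changes, which would be consistent with the empty patch list.

```swift
import Foundation

// Hypothetical summary type: counts and lengths only, no contents.
struct SyncMessageSummary {
    var heads: Int
    var need: Int
    var have: Int
    var changeLengths: [Int]
}

// Minimal cursor over a byte array.
struct ByteReader {
    let data: [UInt8]
    var offset = 0

    mutating func readByte() -> UInt8? {
        guard offset < data.count else { return nil }
        defer { offset += 1 }
        return data[offset]
    }

    // Reads an unsigned LEB128 integer.
    mutating func readULEB() -> Int? {
        var result = 0
        var shift = 0
        while shift < 63, let byte = readByte() {
            result |= Int(byte & 0x7f) << shift
            if byte & 0x80 == 0 { return result }
            shift += 7
        }
        return nil
    }

    mutating func skip(_ count: Int) -> Bool {
        guard count >= 0, count <= data.count - offset else { return false }
        offset += count
        return true
    }

    // Reads a LEB128 count, then skips that many 32-byte hashes.
    mutating func skipHashes() -> Int? {
        guard let count = readULEB(), count <= data.count / 32,
              skip(count * 32) else { return nil }
        return count
    }
}

// Converts a hex string to bytes; returns nil on malformed input.
func decodeHex(_ hex: String) -> [UInt8]? {
    let chars = Array(hex)
    guard chars.count % 2 == 0 else { return nil }
    var out: [UInt8] = []
    for i in stride(from: 0, to: chars.count, by: 2) {
        guard let byte = UInt8(String(chars[i...i + 1]), radix: 16) else { return nil }
        out.append(byte)
    }
    return out
}

// Walks the ASSUMED layout and reports only counts and change lengths.
func summarize(_ message: [UInt8]) -> SyncMessageSummary? {
    var reader = ByteReader(data: message)
    guard reader.readByte() == 0x42,
          let heads = reader.skipHashes(),
          let need = reader.skipHashes(),
          let have = reader.readULEB() else { return nil }
    for _ in 0..<have {
        guard reader.skipHashes() != nil,
              let bloomLength = reader.readULEB(),
              reader.skip(bloomLength) else { return nil }
    }
    guard let changeCount = reader.readULEB() else { return nil }
    var lengths: [Int] = []
    for _ in 0..<changeCount {
        guard let length = reader.readULEB(), reader.skip(length) else { return nil }
        lengths.append(length)
    }
    return SyncMessageSummary(heads: heads, need: need, have: have, changeLengths: lengths)
}

// The 76-byte message logged for iteration 947 above.
let shortHex = "4201180a072590655fb39085ce86e2fca80de391c46dc6c547ffb02c9a9db807dff9000101ad5f0d0deb94803cf74c85ffeac776b9b2f3683799e94396f125e956cba3d61b05010a07000100"
if let message = decodeHex(shortHex), let summary = summarize(message) {
    print("\(message.count) bytes, change lengths: \(summary.changeLengths)")
}
```

If the assumed layout holds, running the same summary on both the sending and receiving side would show whether the change is already missing when the message is generated, which narrows the search to the generate call rather than the receive call.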