realm / realm-core

Core database component for the Realm Mobile Database SDKs

Home Page: https://realm.io


.Net Assertion failed: !realm.is_in_transaction() in test suite

birdalicious opened this issue · comments

SDK and version

SDK: .NET 8

This is a follow-up to #7347. I got busy with other things, but now I have a repro.

Observations

  • I'm not sure how to enable the backtrace
  • This occurs when I run my test suite, which uses an InMemoryConfiguration for the realms, but I cannot deterministically reproduce it. On some test runs it never happens.
  • Two different crash messages occur, independently of each other.

Crash log / stacktrace

  1. The active test run was aborted. Reason: Test host process crashed : D:\a\realm-dotnet\realm-dotnet\wrappers\realm-core\src\realm\object-store\impl\realm_coordinator.cpp:1160: [realm-core-13.23.4] Assertion failed: !realm.is_in_transaction()
  2. The active test run was aborted. Reason: Test host process crashed : Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
    16:41:37:375 Repeat 2 times:
    16:41:37:375 --------------------------------
    16:41:37:375 at Realms.SynchronizationContextScheduler.scheduler_invoke_function(IntPtr, Boolean)
    16:41:37:375 --------------------------------
    16:41:37:375 at Realms.SynchronizationContextScheduler+Scheduler.b__5_0(System.Object)
    16:41:37:375 at Xunit.Sdk.AsyncTestSyncContext+<>c__DisplayClass7_0.b__1(System.Object)
    16:41:37:375 at Xunit.Sdk.MaxConcurrencySyncContext.RunOnSyncContext(System.Threading.SendOrPostCallback, System.Object)
    16:41:37:375 at Xunit.Sdk.MaxConcurrencySyncContext+<>c__DisplayClass11_0.b__0(System.Object)
    16:41:37:375 at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
    16:41:37:375 at Xunit.Sdk.ExecutionContextHelper.Run(System.Object, System.Action`1<System.Object>)
    16:41:37:375 at Xunit.Sdk.MaxConcurrencySyncContext.WorkerThreadProc()
    16:41:37:375 at Xunit.Sdk.XunitWorkerThread+<>c.b__5_0(System.Object)
    16:41:37:375 at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
    16:41:37:375 at System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef, System.Threading.Thread)

Steps & Code to Reproduce

using Realms;

namespace realms_failure
{
    public class UnitTest1
    {
        private readonly RealmCache _cache;
        private readonly RealmConfigurationBase _configuration;

        public UnitTest1()
        {
            _configuration = new InMemoryConfiguration(Guid.NewGuid().ToString());
            _cache = new RealmCache(_configuration);
        }

        [Fact]
        public async Task Test1()
        {
            await _cache.RegisterItem(Guid.NewGuid(), "hello");
        }


        [Fact]
        public async Task Test2()
        {
            await _cache.RegisterItem(Guid.NewGuid(), "hello");
        }
    }

    public class RealmCache(RealmConfigurationBase realmConfig)
    {
        public async Task RegisterItem(Guid messageId, string message)
        {
            using var realm = await Realm.GetInstanceAsync(realmConfig);

            realm.Write(() =>
            {
                var entry = realm.Find<CacheEntry>(messageId);

                if (entry is null)
                {
                    entry = new CacheEntry
                    {
                        MessageId = messageId,
                        Message = message,
                    };
                }

                realm.Add(entry);
            });
        }
    }

    public partial class CacheEntry : IRealmObject
    {
        [PrimaryKey]
        public Guid MessageId { get; set; }
        public string Message { get; set; } = "";
    }
}

Run the two tests together and occasionally they will throw this error in the test output. You can also select 'Run until failure', which will run the tests continually while you watch the test output.

➤ PM Bot commented:

Jira ticket: RCORE-2084

@nirinchev could you take a look? The basic structure of the repro seems fine to me; it should work. Could this be an xunit issue with async, as you mentioned under #7347?

I am unable to repro this, either on my Mac or on Windows. Here's the project I'm running:

Core_7595.zip

Perhaps this only occurs when memory usage is close to 100%, because my laptop is regularly at 95%+.

I should note that I have tried to run your solution with no changes except adding the Visual Studio test runner:

<PackageReference Include="xunit.runner.visualstudio" Version="2.4.5">
	<IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
	<PrivateAssets>all</PrivateAssets>
</PackageReference>

And I did get

The active test run was aborted. Reason: Test host process crashed : Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
15:42:27:069	Repeat 2 times:
15:42:27:069	--------------------------------
15:42:27:069	   at Realms.SynchronizationContextScheduler.scheduler_invoke_function(IntPtr, Boolean)
15:42:27:069	--------------------------------
15:42:27:069	   at Realms.SynchronizationContextScheduler+Scheduler.<Post>b__5_0(System.Object)
15:42:27:069	   at Xunit.Sdk.MaxConcurrencySyncContext.RunOnSyncContext(System.Threading.SendOrPostCallback, System.Object)
15:42:27:069	   at Xunit.Sdk.MaxConcurrencySyncContext+<>c__DisplayClass11_0.<WorkerThreadProc>b__0(System.Object)
15:42:27:069	   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
15:42:27:069	   at Xunit.Sdk.ExecutionContextHelper.Run(System.Object, System.Action`1<System.Object>)
15:42:27:069	   at Xunit.Sdk.MaxConcurrencySyncContext.WorkerThreadProc()
15:42:27:069	   at Xunit.Sdk.XunitWorkerThread+<>c.<QueueUserWorkItem>b__5_0(System.Object)
15:42:27:069	   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
15:42:27:069	   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef, System.Threading.Thread)

And

15:48:42:989	Building Test Projects
15:48:43:052	========== Starting test run ==========
15:48:43:878	[xUnit.net 00:00:00.00] xUnit.net VSTest Adapter v2.4.5+1caef2f33e (64-bit .NET 8.0.4)
15:48:44:377	[xUnit.net 00:00:00.50]   Starting:    Core_7595
15:48:44:465	[xUnit.net 00:00:00.59]   Finished:    Core_7595
[... the Starting/Finished cycle repeats, roughly 20 more identical runs over the next 12 seconds ...]
15:48:56:930	[xUnit.net 00:00:00.20]   Starting:    Core_7595
15:49:00:365	The active test run was aborted. Reason: Test host process crashed : D:\a\realm-dotnet\realm-dotnet\wrappers\realm-core\src\realm\object-store\impl\realm_coordinator.cpp:1165: [realm-core-14.5.1] Assertion failed: !realm.is_in_transaction()
15:49:00:365	<backtrace not supported on this platform>
15:49:00:365	!!! IMPORTANT: Please report this at https://github.com/realm/realm-core/issues/new/choose
15:49:00:365	

when repeatedly running the tests using 'Run until failure'. As you can see, it is not every time and not immediate, but it does occur for me with the solution you provided (with the VS test runner added).

I also lowered my memory usage to 79% and this still occurred.

@birdalicious could you run this until failure under the debugger to capture the stack trace? The second case is potentially interesting, but it would be great to see the code path leading to this assertion in realm_coordinator.cpp.

For the first case, it'd be great to see which address triggers the access violation: is it the function pointer itself in scheduler_invoke_function, or something along the invocation path? The stack trace seems to end where the .NET code ends, since the native part is missing.

@nirinchev does anything in these stack traces suggest that there is an issue in core's transaction logic?

This does indeed appear to be xunit's weird synchronization context. I was finally able to repro this, but if I wrap the tests in AsyncContext.Run I can no longer get a crash. I'm attaching the modified project; @birdalicious, can you check whether it crashes for you:

Core_7595_AsyncContext.zip
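For reference, a minimal sketch of what the AsyncContext.Run workaround looks like, assuming the Nito.AsyncEx NuGet package (the exact layout of the attached project is not shown in this thread, so take this as illustrative only):

```csharp
using System;
using Nito.AsyncEx; // from the Nito.AsyncEx NuGet package
using Realms;
using Xunit;

public class UnitTest1
{
    [Fact]
    public void Test1() => AsyncContext.Run(async () =>
    {
        // The async body runs entirely on Nito's well-behaved
        // synchronization context instead of xunit's
        // MaxConcurrencySyncContext, so Realm's scheduler callbacks
        // are posted back to a context that is still alive.
        var config = new InMemoryConfiguration(Guid.NewGuid().ToString());
        using var realm = await Realm.GetInstanceAsync(config);
        realm.Write(() => { /* test body */ });
    });
}
```

Note that the test method itself becomes synchronous (returns void); only the lambda passed to AsyncContext.Run is async.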

It seems the repro project no longer crashes with AsyncContext.Run.

I am a little surprised that this problem with xunit hasn't cropped up before, as this relatively simple repro can crash the test host. Perhaps it's because the tests don't fail, they just stop running.

However, when I applied that change to a group of tests in my actual project and ran only those tests on repeat, I still got the crashes.
I shall attempt to create a higher-fidelity repro, and I might try switching some of my actual tests over to NUnit to see if it still happens.

I have moved a set of my tests that is known to cause the crash over to an NUnit project, and I have not seen any crashes reported.
I can't say for absolute certain that the tests don't crash or stop running, but it seems highly unlikely that the same issue occurs at all with NUnit.

The problem is that the synchronization context xunit installs is inherently awkward and they switch it up at inconvenient times, making it super difficult for us to integrate nicely with it. The Nito.AsyncEx synchronization context is very well behaved and we've had consistent success using it, so I highly recommend using it instead.

Regarding crashes in your real tests, the most likely cause would be opening a Realm file as part of your test setup and then trying to use it in an AsyncContext.Run callback, though that should produce an exception rather than a hard crash. If you're able to create a repro with xunit and AsyncContext, I can give it a go and try to figure out what's going on.

Yes, we have a base class that keeps a realm instance open from construction to disposal. The realm kept by the base class isn't interacted with at all, but if I remove it there are no more crashes.

On the same note: since we use the InMemoryConfiguration, this base class kept the data in the realm from being deleted. So for each test I would have to start the test with a using var realm = Realm.GetInstance(_config). Would this be the suggested approach? And I assume there is no way to change the async context in xunit to Nito's; we just have to wrap the function in AsyncContext.Run?

That is correct. Note that this only applies if you use async/await in your tests; for synchronous tests, you don't have to wrap them in AsyncContext.Run. You can see some helpers we use in our tests here:

https://github.com/realm/realm-dotnet/blob/221d4c3f0f9fd098f67ec671637c9e4c84aa0fce/Tests/Realm.Tests/TestHelpers.cs#L296

and here for the disposal of the Realm file:

https://github.com/realm/realm-dotnet/blob/221d4c3f0f9fd098f67ec671637c9e4c84aa0fce/Tests/Realm.Tests/RealmTest.cs#L140

Essentially, rather than using Realm.GetInstance(...), we have a GetRealm function that adds all open realms to a queue, which gets disposed on test teardown. You can abstract that further and have the test automatically open an in-memory realm instance with a random identifier and dispose of it on teardown.
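A rough sketch of what such a base class could look like; the names (RealmTestBase, GetRealm, _openRealms) are illustrative and are not the actual realm-dotnet test helpers:

```csharp
using System;
using System.Collections.Generic;
using Realms;

// Hypothetical test base class: tracks every realm opened during a test
// and disposes them all on teardown (xunit calls Dispose after each test).
public abstract class RealmTestBase : IDisposable
{
    private readonly List<Realm> _openRealms = new();

    // Open a realm (by default an in-memory one with a random identifier)
    // and register it for disposal on teardown.
    protected Realm GetRealm(RealmConfigurationBase? config = null)
    {
        var realm = Realm.GetInstance(
            config ?? new InMemoryConfiguration(Guid.NewGuid().ToString()));
        _openRealms.Add(realm);
        return realm;
    }

    public void Dispose()
    {
        foreach (var realm in _openRealms)
        {
            realm.Dispose();
        }

        _openRealms.Clear();
    }
}
```

Tests then call GetRealm() instead of Realm.GetInstance(...) and never need to worry about disposing instances themselves.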

I think we can then be satisfied that this is wholly caused by xUnit's synchronization context, as attempting to put any kind of Realm logic in an xUnit test constructor/Dispose while using AsyncContext.Run seems to crash the test host. But I know enough to stop this from happening now.

As an aside to this but related to:
https://github.com/realm/realm-dotnet/blob/221d4c3f0f9fd098f67ec671637c9e4c84aa0fce/Tests/Realm.Tests/RealmTest.cs#L122C32-L122C33
It seems that when freezing an IQueryable, Realm.DeleteRealm always throws Realms.Exceptions.RealmInUseException: Cannot delete files of an open Realm. I removed the freeze to see if that was causing it, and the test passed and the realm was deleted.

I know that frozen realms stick around until everything using them has been released, but I'm not sure what is happening in this case, as it seems to be just the freeze that is keeping the realm in use. This occurs with both InMemoryConfiguration and a normal realm file configuration.

The issue with Freeze is that it creates a temporary frozen realm that you're not tracking anywhere, so it's up to the GC to decide when to dispose of it. If you want to be able to deterministically delete the Realm after the test is complete (which you shouldn't have to, since you're using an in-memory realm), you'd need something like this:

var realm = Realm.GetInstance(...);
CleanupOnTeardown(realm);

var query = realm.All<Foo>();
var frozenQuery = query.Freeze();
CleanupOnTeardown(frozenQuery.AsRealmCollection().Realm);

// Run your test

And then on teardown, you'd need to dispose of all the live and frozen realms created during your test. You can see some of the helper methods we use in our tests here:

https://github.com/realm/realm-dotnet/blob/221d4c3f0f9fd098f67ec671637c9e4c84aa0fce/Tests/Realm.Tests/RealmInstanceTest.cs#L36-L66

I was hoping to use Realm.DeleteRealm to clean up the management folder and lock file that are created when using InMemoryConfiguration. However, I now realize that the lock file isn't deleted, for inter-process synchronization, so I shall have to find another approach to clean up those files.

But thank you for your insight and help; it has been very helpful.

No worries. FWIW, in our tests, we just generate a random folder where all test files will be stored and then delete it after the test run, but it's up to you to decide if that approach would work for your use case. I'm going to close this now as we have high confidence the culprit is the xunit sync context and we have a reasonable workaround.
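A sketch of that random-folder approach, assuming an xunit-style fixture; the name TestFolderFixture and its layout are illustrative, not the actual realm-dotnet helpers:

```csharp
using System;
using System.IO;
using Realms;

// Hypothetical fixture: all realm files for a test run live in one
// randomly named temp folder, which is deleted wholesale on teardown.
public sealed class TestFolderFixture : IDisposable
{
    public string FolderPath { get; } =
        Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString());

    public TestFolderFixture() => Directory.CreateDirectory(FolderPath);

    // Every realm file, along with its .lock file and .management folder,
    // is created next to the realm path, so deleting FolderPath cleans
    // up everything, including files DeleteRealm would leave behind.
    public RealmConfiguration CreateConfig() =>
        new RealmConfiguration(
            Path.Combine(FolderPath, $"{Guid.NewGuid()}.realm"));

    public void Dispose() => Directory.Delete(FolderPath, recursive: true);
}
```

This sidesteps the lock-file problem entirely, since nothing outside the test run ever looks inside the folder.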