OctopusDeploy / Nevermore

A JSON Document Store library for SQL Server

Async, and read-only transactions

PaulStovell opened this issue

I started adding this but it's quite a big change, so before I go much further (or submit a PR which we decide not to merge) I wanted to float these changes.

Change 1: Async support

I'd like to add async support to Nevermore. There's quite a bit of work to support this, mostly forking each method to add an async version. For example:

        // Sync
        [Pure]
        public T Load<T>(string id) where T : class, IId
        {
            return TableQuery<T>()
                .Where("[Id] = @id")
                .Parameter("id", id)
                .FirstOrDefault();
        }
        
        // Async
        [Pure]
        public Task<T> LoadAsync<T>(string id) where T : class, IId
        {
            return TableQuery<T>()
                .Where("[Id] = @id")
                .Parameter("id", id)
                .FirstOrDefaultAsync();
        }

The only thing blocking this, which we need a decision on, is a change to which DB abstractions we use. Nevermore currently uses IDbCommand and its kin, which don't have async versions. If we change to DbCommand etc., then we can use async versions.

(DbCommand's async methods just call the sync equivalent, but SqlCommand overrides them with real async implementations)
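As a minimal sketch of what the switch enables: `IDbCommand` only declares the synchronous `Execute*` methods, while the `DbCommand` base class also declares `ExecuteScalarAsync`, `ExecuteNonQueryAsync` and `ExecuteReaderAsync`. The `CommandSketch` class and `CountRowsAsync` helper below are hypothetical names used for illustration; they are not part of Nevermore.

```csharp
using System;
using System.Data.Common;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only (not Nevermore code).
public static class CommandSketch
{
    // Accepting DbConnection (rather than IDbConnection) means CreateCommand()
    // returns a DbCommand, which exposes the async Execute* methods.
    public static async Task<int> CountRowsAsync(DbConnection connection, string sql)
    {
        using (DbCommand command = connection.CreateCommand())
        {
            command.CommandText = sql;

            // With SqlCommand this is a genuinely asynchronous call; the
            // DbCommand base implementation falls back to the sync method.
            object result = await command.ExecuteScalarAsync();
            return Convert.ToInt32(result);
        }
    }
}
```

The same method written against `IDbConnection`/`IDbCommand` would have no async call available without a cast.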

Question: does anyone object to switching to DbCommand everywhere (where required) in Nevermore?

Change 2: Read-only transactions

In Octofront we use geo-replication to create read-only secondaries. Our rules are simple:

  • Code running in an HTTP GET request gets a connection to the read-only secondary database, which is faster, but read-only
  • Code running in an HTTP POST request gets a connection to the primary database, which is writable but further away

The only issue right now is that there's no type-safety. You can inject a transaction and call insert or update, but it will only fail at runtime. It would be nice to have two explicit types for this and let code inject the version it needs.

Currently, we have this object model:

[image: diagram of the current object model]

I'd like to propose:

[image: diagram of the proposed object model]

This allows for:

  • Code that only needs to read from the database can be explicit about that. In geo-replicated scenarios, they could read from a local read-only secondary
  • Code that intends to write data is also explicit about that

It's unlikely that code will write data without reading any data at all, so an IWriteTransaction is also an IReadTransaction. But in our case, it will read and write from the primary database.

IRelationalStore would have methods like BeginReadTransaction and BeginWriteTransaction. The current BeginTransaction would call BeginWriteTransaction to maintain backward compatibility.

Can I get 👍 or 👎 on whether you agree with this direction before continuing?

Regarding async: 👍
Moving to DbCommand/SqlCommand sounds good.

Some decisions that might be worth considering (and these may be based on whether we want Nevermore to be something we are designing for general use, or whether it is just something we want to use internally):

1. ConfigureAwait(false)

Should we do the .ConfigureAwait(false) thing everywhere? Or omit it, based on the assumption that our usages of Nevermore don't need it and we could avoid the effort?
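For reference, "doing the `.ConfigureAwait(false)` thing" means appending it to every `await` inside the library, roughly as in this sketch. `ConfigureAwaitSketch` and `LengthAsync` are hypothetical names, and the `Task<string>` parameter stands in for any awaited database call.

```csharp
using System.Threading.Tasks;

// Hypothetical example, for illustration only (not Nevermore code).
public static class ConfigureAwaitSketch
{
    public static async Task<int> LengthAsync(Task<string> load)
    {
        // ConfigureAwait(false) tells the awaiter not to marshal the
        // continuation back to a captured SynchronizationContext, which
        // matters for callers that have one (for example UI frameworks).
        string value = await load.ConfigureAwait(false);
        return value.Length;
    }
}
```

The cost is purely mechanical: every `await` in the library needs the suffix, and a missed one silently reverts to the default behaviour.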

2. The API

We currently re-implement (as a decorator) many of the interfaces in Nevermore within Octopus Server. Adding a number of new public methods to the interfaces increases the amount of code we need to write in Octopus Server. Having fewer public methods makes the implementation in Octopus Server simpler.

Do we need to support both sync and async query variants within Nevermore? Would all consumers use both of them?

Another option might be to provide the async variant only. I'm assuming that in Octofront we would only use the async queries. It seems likely that at some point in Octopus Server's future, we would switch to async as well. In the meantime, we could add an extension method like:

public static class TransactionExtensions
{
    // Sync wrapper that blocks on the async implementation
    public static T Load<T>(this IQueryExecutor queryExecutor, string id) where T : class, IId
    {
        return queryExecutor.LoadAsync<T>(id).GetAwaiter().GetResult();
    }
}

to avoid us needing to update our usages.

Regarding read-only transactions: I like the split between IReadTransaction and IWriteTransaction 👍

With regard to the way you create these transactions, are you proposing the following?

class RelationalStore
{
	public IReadTransaction BeginReadTransaction(...) {}
	public IWriteTransaction BeginWriteTransaction(...) {}
}

public interface IRelationalTransaction : IDisposable, IWriteTransaction
{
	void Commit();
}

public interface IWriteTransaction : IReadTransaction {...}
public interface IReadTransaction {...}

If so, two problems arise from this pattern:

  1. You can't access .Commit() on IWriteTransaction
  2. The Write/Read transactions aren't disposable.

An obvious solution to this problem for the write transaction would be to change the API to:

class RelationalStore
{
	public IReadTransaction BeginReadTransaction(...) {}
	public IRelationalTransaction BeginWriteTransaction(...) {}
}

However, problem 2 still exists for the read transaction. In fact, problem 1 might exist for the read transaction as well: it may make sense to commit a read transaction to release any locks that are held. In practice, I don't think we do this, so this might not be a real problem for us.

Making IWriteTransaction and IReadTransaction disposable is not a good solution either. In Octopus Server, we currently rely on the distinction between IQueryExecutor and IRelationalTransaction (the former being non-disposable) to help enforce which methods are only executing queries, and which methods are responsible for disposing or committing.

Here's one possible solution:

public interface IWriteTransaction : IWriteQueryExecutor, IRelationalTransaction { }
public interface IReadTransaction : IReadQueryExecutor, IRelationalTransaction { }
public interface IWriteQueryExecutor : IReadQueryExecutor { } 
public interface IReadQueryExecutor { }
public interface IRelationalTransaction : IDisposable
{
    void Commit();
}

class RelationalTransaction : IWriteTransaction, IReadTransaction {...}

class RelationalStore
{
	public IReadTransaction BeginReadTransaction(...) {}
	public IWriteTransaction BeginWriteTransaction(...) {}
}

I like the object model you illustrated above; I'll go with that.

@TomPeters re: ConfigureAwait(false), my understanding was that this isn't needed in .NET Core anymore? I might be out of date though.

Regarding this:

public static class TransactionExtensions
{
    // Sync wrapper that blocks on the async implementation
    public static T Load<T>(this IQueryExecutor queryExecutor, string id) where T : class, IId
    {
        return queryExecutor.LoadAsync<T>(id).GetAwaiter().GetResult();
    }
}

It's tempting and I considered this, but my worry was that it would slow Octopus down (as it currently only uses the sync versions). It means newing up state machines and tasks etc. for the same outcome, so it's only a performance penalty. Duplicating the methods sucks but at least it has no performance penalty for Octopus Server.

> my understanding was that this isn't needed in .NET Core anymore? I might be out of date though.

This is more or less true, but Nevermore is built against netstandard (not netcore), so it could be used by consumers targeting .NET Framework.

> worry was that it would slow Octopus down

My intuition is that the cost would be negligible. We could probably measure this instead of guessing :)

Good point. To test it out, I changed this in ChaosSqlCommand:

public int ExecuteNonQuery()
{
    return ((SqlCommand)wrappedCommand).ExecuteNonQueryAsync().Result;
}

public IDataReader ExecuteReader()
{
    MakeSomeChaos();
    return ((SqlCommand)wrappedCommand).ExecuteReaderAsync().Result;
}

I added some code that does 1000 inserts:

var watch = Stopwatch.StartNew();
for (var i = 0; i < 1000; i++)
{
    creator.Insert(new Product { Name = "Product " + i, Price = i, Type = ProductType.Dodgy});
}
Console.WriteLine($"Time taken: {watch.ElapsedMilliseconds}ms");

And the same for 300 loads:

var watch = Stopwatch.StartNew();
for (var i = 0; i < 300; i++)
{
    var p = reader.Load<Product>("Products-1");
    Assert.IsTrue(p != null);
}
Console.WriteLine($"Time taken for Load: {watch.ElapsedMilliseconds}ms");

The "Insert" test runs in 231ms without async/result, and 274ms with async, roughly 18% overhead.

The "Load" test went from 58ms to 68ms with the extra async call. So 17% overhead.

Those inserts/loads involve very little data though, and they're running against my local SQLEXPRESS. If doing a really large query or going over a network, I agree the relative overhead would be much less. But a lot of customers do use SQLEXPRESS on the same machine, and a lot of our queries are small (Load(id) happens on nearly every page, for example), so the overhead might add up.

18% is probably the worst-case but I wouldn't be surprised if real-world it was something like 2-5% all up? Also, it's going to add more work to the GC which isn't measured here.

Shipping a release that makes Octopus even 2% slower feels like too much. That's high enough for me to justify the duplication and the maintenance of both code paths.

(It's also a good argument for not instantly switching all Octopus controllers over to call the async version - it probably doesn't make sense for calls that simply fetch or update one document)

> ConfigureAwait(false), my understanding was that this isn't needed in .NET Core

Apparently it isn't needed in ASP.NET Core anymore, but it still is in other things (like WPF): https://blog.stephencleary.com/2017/03/aspnetcore-synchronization-context.html

👍 from me

@slewis74 @Waldo000000 heads up as you are working on Nevermore as well

👍

@TomPeters are you OK with this?

public interface IReadQueryExecutor { }
public interface IWriteQueryExecutor : IReadQueryExecutor { } 

public interface IReadTransaction : IReadQueryExecutor, IDisposable { }
public interface IWriteTransaction : IReadTransaction, IWriteQueryExecutor
{
    void Commit();
}

[Obsolete("Use IReadTransaction or IWriteTransaction")]
public interface IRelationalTransaction : IWriteTransaction
{
}

The slight changes are:

  • Both I[Read/Write]Transaction are disposable
  • Only IWriteTransaction can Commit()
  • IRelationalTransaction can do everything it currently can, but it's marked as obsolete
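To make the type-safety benefit concrete, here is a minimal, self-contained sketch of how calling code could depend on this hierarchy. The interface names match the proposal above; the `Load` and `Insert` members, the `InMemoryTransaction` class and the `Handlers` class are hypothetical stand-ins invented for illustration, not Nevermore's real API.

```csharp
using System;
using System.Collections.Generic;

public interface IReadQueryExecutor
{
    string Load(string id);                      // hypothetical member
}

public interface IWriteQueryExecutor : IReadQueryExecutor
{
    void Insert(string id, string document);     // hypothetical member
}

public interface IReadTransaction : IReadQueryExecutor, IDisposable { }

public interface IWriteTransaction : IReadTransaction, IWriteQueryExecutor
{
    void Commit();
}

// Hypothetical in-memory stand-in for a real database transaction.
public sealed class InMemoryTransaction : IWriteTransaction
{
    private readonly Dictionary<string, string> committed;
    private readonly Dictionary<string, string> pending = new Dictionary<string, string>();

    public InMemoryTransaction(Dictionary<string, string> committed)
    {
        this.committed = committed;
    }

    public string Load(string id)
    {
        // Uncommitted writes in this transaction win over committed data.
        if (pending.TryGetValue(id, out var doc)) return doc;
        return committed.TryGetValue(id, out doc) ? doc : null;
    }

    public void Insert(string id, string document) => pending[id] = document;

    public void Commit()
    {
        foreach (var pair in pending) committed[pair.Key] = pair.Value;
        pending.Clear();
    }

    // Disposing without Commit() discards uncommitted writes.
    public void Dispose() => pending.Clear();
}

public static class Handlers
{
    // A GET-style handler asks only for read access, so calling
    // reader.Insert(...) here would be a compile-time error.
    public static string HandleGet(IReadQueryExecutor reader, string id) => reader.Load(id);

    // A POST-style handler asks for write access and can also read.
    public static void HandlePost(IWriteQueryExecutor writer, string id, string document)
        => writer.Insert(id, document);
}
```

Because the handlers take the non-disposable executor interfaces, only the code that began the transaction is in a position to commit or dispose it, which preserves the distinction Octopus Server relies on today.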

That's fine, although I'd tend towards just removing IRelationalTransaction instead of marking it obsolete.

This is based on the assumption that we are the main users of this library and we can just update our code, rather than keeping obsolete methods around. I don't think we've been shy about making breaking changes before, although this is a significant one.

@TomPeters I put a PR together with a first cut of everything.

Re: Removing IRelationalTransaction, I'm all for it! Do you think we just remove the interface and then let whoever upgrades Nevermore in Octopus go about making the changes?

> Do you think we just remove the interface and then let whoever upgrades Nevermore in Octopus go about making the changes?

I think it would be better to do this as part of this change, while you have all the context in your head about what's changed. It should be quite simple: effectively a rename refactor from IRelationalTransaction to IWriteTransaction.