akkadotnet / Akka.Persistence.Sql

Linq2Db implementation of Akka.Persistence.Sql: a common implementation for SQL Server, Sqlite, Postgres, Oracle, and MySql.

Make sure that we're using the new protobuf backed serializer

Arkatufus opened this issue · comments

See the actual problem here: akkadotnet/akka.net#3811
Example implementation in MongoDb: akkadotnet/Akka.Persistence.MongoDB#71

I've had some pushback on this issue before from @to11mtm and others, but I'm generally of the opinion that using Protobuf envelopes is a good idea so we can future-proof against schema migration changes. Open to suggestions, though.

We'll stick with the current serialization regime for Akka.Persistence.Sql.

commented

Adding some flavor for those who may see this issue and have questions, and to respond to other comments here:

In short, JVM Akka tried the Protobuf-backed serializer but wound up moving away from it. Last I looked at any of the 'sanctioned' plugins for the JVM (i.e. before they moved away from Apache and onto the BSL), none of them used Protobuf envelopes; instead each had some level of optimization for the specific vagaries of its store.

Best reasons I could divine (since here we do things the way the JVM did), and I agree with these in principle as well as in practice:

  1. Given the way information is stored in journals/snapshots, we already have all of the data we care about as far as Serializer ID, Serializer Manifest, etc.
    1. IIRC we may not be handling the Event Manifest yet (will need to look at the code), but we do have it provisioned in the table.
  2. A Protobuf envelope would unfortunately add a layer of indirection (i.e. double serialization and double deserialization), and while I can't speak for the JVM world, .NET Protobuf implementations tend to be pretty allocation-heavy and not as fast as other solutions...
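To make the indirection in point 2 concrete: with an envelope, the payload is serialized once by its own serializer and then wrapped and serialized a second time, so every read needs two deserialization passes, while the column approach stores the metadata beside the raw payload bytes. A rough sketch of the difference, with JSON standing in for Protobuf and all names (serializer id, manifest) being hypothetical placeholders:

```python
import json

def user_serialize(event):
    # Stand-in for the event's own serializer (e.g. JSON/Hyperion).
    return json.dumps(event).encode("utf-8")

def store_with_columns(event):
    # Column approach: serializer metadata lives beside the payload bytes,
    # so reading back is a single deserialization pass.
    return {"serializer_id": 1, "manifest": "MyEvent",
            "payload": user_serialize(event)}

def store_with_envelope(event):
    # Envelope approach: the already-serialized payload is wrapped and
    # serialized a second time (JSON stands in for Protobuf here), so
    # reads need two deserialization passes.
    envelope = {"serializer_id": 1, "manifest": "MyEvent",
                "payload": user_serialize(event).decode("utf-8")}
    return json.dumps(envelope).encode("utf-8")

event = {"counter": 42}

# Column read-back: one pass over the payload bytes.
row = store_with_columns(event)
assert json.loads(row["payload"]) == event

# Envelope read-back: unwrap the envelope, then decode the payload.
outer = json.loads(store_with_envelope(event))
assert json.loads(outer["payload"]) == event
```

This is only the shape of the argument, not the actual Akka.NET wire format; the point is the extra serialize/deserialize pass (and the extra allocations that come with it) in the envelope path.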

but I'm generally of the opinion that using the Protobuf envelopes is a good idea so we can future proof against future schema migration changes.

I mean, they do make some things easier. Won't argue that.

However, as counterpoints:

  • The Protobuf envelopes would lead to data duplication:
    • For SQL implementations, we already need PersistenceId and SequenceNumber as columns in the DB for core functionality.
    • If I actually needed to troubleshoot a serialization issue, I'd definitely prefer having SerializerId/SerializerManifest as columns, like they are now, versus living in an envelope I'd have to decode.
  • The kinda-nice thing about AkkaPersistenceDataConnectionFactory is that it lets us handle a -lot- on our end via config flags/etc.
    • And yes, I know we already have a lot of... violent code in there and in other parts as far as config flags go, but that is almost entirely because we theoretically support 'live' migration to tag tables (something not even the JVM did, go us!) as well as backwards compatibility with old SQL providers.
    • FWIW, there -are- ways to do 'sensing' via Linq2Db. Imagine we added a string property, Baz, to PersistentRepr. In the connection factory you could run a query like `sensingContext.GetTable<JournalRow>().Select(r => r.Baz).FirstOrDefault()` inside a try-catch: if it doesn't throw, you know the column is there and can act appropriately; otherwise you ignore the column in the mapping, and IIRC Linq2Db can be finagled into providing a 'default' in those cases.
    • Yes, I know it's still not always pretty. Alas, balancing performance against code violence is often tricky.
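The 'sensing' idea above is Linq2Db/C#-specific, but the underlying pattern is portable: probe for the column with a cheap query inside a try-catch, and fall back to the old mapping if it throws. A minimal Python/SQLite analogue of that pattern (table and column names here are invented for illustration, not the plugin's actual schema):

```python
import sqlite3

def column_exists(conn, table, column):
    # Probe by selecting the column with LIMIT 1; if the column is
    # missing, the database raises and we treat that as "not there".
    try:
        conn.execute(f"SELECT {column} FROM {table} LIMIT 1")
        return True
    except sqlite3.OperationalError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE journal (persistence_id TEXT, sequence_nr INTEGER)")

# Old schema: no 'baz' column yet, so we'd fall back to a default mapping.
assert column_exists(conn, "journal", "persistence_id")
assert not column_exists(conn, "journal", "baz")

# After a migration adds the column, sensing picks it up automatically.
conn.execute("ALTER TABLE journal ADD COLUMN baz TEXT")
assert column_exists(conn, "journal", "baz")
```

In the connection-factory scenario the same probe would run once at startup, and the result would decide which column mapping (with or without the new property) gets registered.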