dzfranklin / plantopo


improve sync backend

dzfranklin opened this issue · comments

save snapshots log-style in reasonable chunks

rewrite the store based on what we learned from rewriting the client? the server has different validation needs, but it still shouldn't be too bad. rather than paying up front to materialize, maybe we can write an internal API that treats the store as a graph: for now it works by naively just walking, but in the future it could be backed by a materialized version?
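A rough sketch of what that internal API could look like. Everything here is hypothetical (`GraphStore`, `Node`, `naiveStore` are illustrative names, not the real store's types): the point is that callers only see the interface, so swapping the naive walk for a materialized index later wouldn't change them.

```go
package main

import "fmt"

// Hypothetical node shape; the real store's schema will differ.
type ID string

type Node struct {
	ID     ID
	Parent ID
}

// GraphStore is the internal API callers use. Today it walks naively;
// later the same interface could be backed by a materialized index.
type GraphStore interface {
	Get(id ID) (Node, bool)
	Children(id ID) []ID
}

// naiveStore answers every query by scanning the flat map.
type naiveStore struct {
	nodes map[ID]Node
}

func (s *naiveStore) Get(id ID) (Node, bool) {
	n, ok := s.nodes[id]
	return n, ok
}

func (s *naiveStore) Children(id ID) []ID {
	var out []ID
	for _, n := range s.nodes { // naive full scan
		if n.Parent == id {
			out = append(out, n.ID)
		}
	}
	return out
}

func main() {
	s := &naiveStore{nodes: map[ID]Node{
		"root": {ID: "root"},
		"a":    {ID: "a", Parent: "root"},
	}}
	fmt.Println(s.Children("root")) // [a]
}
```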

debug why sessions aren't getting closed

tell the client to come back later if the session is still being created, and have it seamlessly retry
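A minimal sketch of the seamless-retry idea. `ErrSessionCreating` is a hypothetical sentinel the server would return while the session is still being set up; the client loops on it without surfacing an error:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrSessionCreating is a hypothetical sentinel meaning "come back later".
var ErrSessionCreating = errors.New("session creating, retry")

// connectFactory stands in for the real dial; it fails twice, then succeeds.
func connectFactory() func() error {
	attempts := 0
	return func() error {
		attempts++
		if attempts < 3 {
			return ErrSessionCreating
		}
		return nil
	}
}

// connectWithRetry retries quietly while the server reports the session
// is being created, so the user never sees the transient error.
func connectWithRetry(connect func() error, backoff time.Duration, max int) error {
	for i := 0; i < max; i++ {
		err := connect()
		if err == nil {
			return nil
		}
		if !errors.Is(err, ErrSessionCreating) {
			return err // a real error: propagate
		}
		time.Sleep(backoff)
	}
	return errors.New("gave up waiting for session")
}

func main() {
	err := connectWithRetry(connectFactory(), time.Millisecond, 5)
	fmt.Println(err) // <nil>
}
```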

optimistically compact logs of closed sessions, then flip over to the compacted version only if there were no new writes in the time period
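The flip-only-if-quiet shape could look roughly like this. It's a synchronous toy (the real version would compare generations across a time window, not slice lengths in one call), and `log`/`compactIfQuiet` are invented names:

```go
package main

import "fmt"

type log struct {
	entries   []string
	compacted string
}

// compactIfQuiet optimistically builds a snapshot from the entries seen
// at the start, then flips over to it only if no new writes arrived in
// the meantime. If a write raced us, the snapshot is discarded.
func (l *log) compactIfQuiet(snapshot func([]string) string) bool {
	seen := len(l.entries)
	snap := snapshot(l.entries[:seen])
	if len(l.entries) != seen {
		return false // a write raced us; retry later
	}
	l.compacted = snap
	l.entries = nil
	return true
}

func main() {
	l := &log{entries: []string{"a", "b", "c"}}
	ok := l.compactIfQuiet(func(es []string) string {
		s := ""
		for _, e := range es {
			s += e
		}
		return s
	})
	fmt.Println(ok, l.compacted) // true abc
}
```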

now that we're doing fancy things with the writes we have to worry about backups to protect against bugs. what about regularly writing the entire map into s3? or we could regularly roll up our deltas into a snapshot and have that snapshot also serve as a backup against broken deltas, so it gets validated in regular use. if we use dynamodb those snapshots would then be replicated

one 1 KiB dynamodb write per minute for a month (60 × 24 × 30 ≈ 43,200 writes) is $0.07, so that seems like a reasonable rate for the session. reads are much cheaper

for comparison, one s3 PUT per minute for a month is $0.2385, ~3.4x the cost.

$0.30 to store 1GB on dynamodb vs $0.02 on s3

I'm using full usage of one map as a benchmark in part because I'm wary of session bugs that keep it running without my noticing.

but the free tier for provisioned capacity gives "25GB of storage, along with 25 provisioned Write and 25 provisioned Read Capacity Units (WCU, RCU) which is enough to handle 200M requests per month."

dynamodb WCUs cover writes up to 1 KiB, so we'd want to pack deltas into that. but in realistic use, would anything complicated be worth it if we aren't willing to delay writes?
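For concreteness, packing might look like the sketch below: a hypothetical `packer` that flushes whenever the next delta would push the batch past the 1 KiB WCU boundary. Note the buffering *is* the write delay the question above worries about, so this only helps if we accept holding deltas briefly.

```go
package main

import "fmt"

const wcuSize = 1024 // one DynamoDB WCU covers an item up to 1 KiB

// packer batches small deltas into writes that each stay within one WCU.
type packer struct {
	buf     []byte
	flushed [][]byte
}

// add appends a delta, flushing first if it would overflow the 1 KiB batch.
func (p *packer) add(delta []byte) {
	if len(p.buf)+len(delta) > wcuSize {
		p.flush()
	}
	p.buf = append(p.buf, delta...)
}

// flush emits the current batch as one write, if non-empty.
func (p *packer) flush() {
	if len(p.buf) == 0 {
		return
	}
	p.flushed = append(p.flushed, p.buf)
	p.buf = nil
}

func main() {
	p := &packer{}
	for i := 0; i < 5; i++ {
		p.add(make([]byte, 300)) // five 300-byte deltas
	}
	p.flush()
	// 3 deltas (900 B) fit in one WCU; the 4th would overflow,
	// so we end up with two packed writes instead of five.
	fmt.Println(len(p.flushed)) // 2
}
```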

maybe just go for it and see real-life usage? this doesn't look like scary scale for a few maps, even if I have bugs

potentially write a test harness for the stores

it would generate random changesets and run them against both the client and server stores to check that certain properties hold.

  • write little binaries that wrap a common stdin interface over both stores
  • have a directory with manually written sample inputs
  • have another directory to write snapshots to: the harness writes .snap.new and the user manually copies it to .snap

later it would be nice to have property-based testing

consider making session a type protected by a mutex, without its own goroutine. its users poll for changes since the last generation they saw.

it has PutAware, GetAware, Update, and GetChangesSince, the last returning the latest generation and a changeset.

We can use generations to implement loading: it's just a special case where the latest generation is zero for a little while, then it jumps and you can ask for more.
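A sketch of the mutex-protected session, assuming only `Update` and `GetChangesSince` for brevity (`Session`, `change`, and the field layout are guesses, and PutAware/GetAware are omitted). A latest generation of zero reads as "still loading":

```go
package main

import (
	"fmt"
	"sync"
)

type change struct {
	gen  uint64
	data string
}

// Session is a plain mutex-protected type with no goroutine of its own;
// callers poll for changes since the generation they last saw.
type Session struct {
	mu      sync.Mutex
	gen     uint64
	changes []change
}

func (s *Session) Update(data string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.gen++
	s.changes = append(s.changes, change{gen: s.gen, data: data})
}

// GetChangesSince returns the latest generation plus every change made
// after `since`. latest == 0 means the session is still loading.
func (s *Session) GetChangesSince(since uint64) (latest uint64, out []string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, c := range s.changes {
		if c.gen > since {
			out = append(out, c.data)
		}
	}
	return s.gen, out
}

func main() {
	s := &Session{}
	latest, _ := s.GetChangesSince(0)
	fmt.Println(latest) // 0: still loading

	s.Update("a")
	s.Update("b")
	latest, changes := s.GetChangesSince(0)
	fmt.Println(latest, changes) // 2 [a b]

	latest, changes = s.GetChangesSince(latest)
	fmt.Println(latest, len(changes)) // 2 0: caught up
}
```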

addressed in #149

closed by #149