automerge-go
is an experiment in implementing columnar automerge in Go, as described by Martin Kleppmann (columnar experiment writeup).
*** THIS IS NON-FUNCTIONAL CODE ***
While the code will be functional at some point, its current intent is to explore the performance characteristics of columnar automerge.
Benchmark run from a Mac Pro desktop. See https://github.com/savaki/automerge-benchmark.
=== RUN TestPerformance
applying edits ...
25000: 62388 bytes, 3.1 µs/op
50000: 126464 bytes, 3.1 µs/op
75000: 189055 bytes, 4.2 µs/op
100000: 252513 bytes, 3.5 µs/op
125000: 314784 bytes, 2.9 µs/op
150000: 376359 bytes, 7.2 µs/op
175000: 439171 bytes, 3.5 µs/op
200000: 502771 bytes, 6.7 µs/op
225000: 569985 bytes, 7.5 µs/op
250000: 634307 bytes, 4.1 µs/op
edits -> 259778
bytes -> 659432
pages -> 5056
--- PASS: TestPerformance (1.52s)
automerge-go
organizes automerge operations into a series of pages. Each page currently consists of 6 fields, with data stored in columnar format:
- op_counter - together with op_actor, forms the lamport timestamp of the current operation (int64)
- op_actor - together with op_counter, forms the lamport timestamp of the current operation ([]byte)
- ref_counter - counter for the reference lamport timestamp (int64)
- ref_actor - actor for the reference lamport timestamp ([]byte)
- op_type - identifies the operation to be performed (int64)
- value - the data associated with the operation (varies)
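To make the field list concrete, a single operation can be pictured as the following struct before its fields are split apart into columns (an illustrative sketch, not the actual in-memory representation):

```go
// Op is an illustrative, row-oriented view of one operation; in practice
// each field is stored in its own column within a page.
type Op struct {
	OpCounter  int64  // op_counter: lamport counter of this operation
	OpActor    []byte // op_actor: actor id of this operation
	RefCounter int64  // ref_counter: counter of the referenced operation
	RefActor   []byte // ref_actor: actor of the referenced operation
	OpType     int64  // op_type: operation code, interpreted by the reader
	Value      []byte // value: payload associated with the operation
}
```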
Each field is encoded using one of the following encodings, which are borrowed heavily from parquet:
- rle - run length encoding of int64 values
- delta_rle - delta encoding on top of run length encoding
- dictionary_rle - encodes []byte to an int64 and stores the int64 using run length encoding. Similar to parquet's DICTIONARY_RLE
- plain - records are stored in adjacent series based on parquet's PLAIN format
rle
provides a simple run length encoding for int64 values. Each run of identical int64 values is stored as a pair of values: the repetition count followed by the value.
For example, the sequence [1] would be stored as [1,1], which you can think of as 1 repetition of 1. The sequence [1,1,1] would be stored as [3,1]; 3 repetitions of 1. And a sequence like [1,1,1,2,2,1,1,1,1] would be stored as [3,1,2,2,4,1].
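A minimal sketch of this encoding (the function name is hypothetical, not the package's actual API):

```go
// encodeRLE collapses runs of identical int64 values into
// (repetition count, value) pairs, e.g. [1,1,1,2,2] -> [3,1,2,2].
func encodeRLE(values []int64) []int64 {
	var out []int64
	for i := 0; i < len(values); {
		j := i
		for j < len(values) && values[j] == values[i] {
			j++
		}
		out = append(out, int64(j-i), values[i]) // count, then value
		i = j
	}
	return out
}
```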
delta_rle
sits atop rle and first encodes int64 values as a series of deltas.
For example, the sequence [1,2,3] could first be delta encoded to [1,1,1], which can be thought of as: start with 1, add 1 to get to the second value, 2, and add 1 to that to get to the third value, 3. [1,1,1] can then be further reduced using rle to [3,1].
As runs of repeated deltas get longer, delta_rle compression gets better. For example, a sequence from 1 to 100 would encode as [100,1], or just two bytes.
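A sketch of the two-step transform, reusing the hypothetical encodeRLE above:

```go
// encodeDeltaRLE rewrites each value as the delta from its predecessor
// (the first value is kept as-is, i.e. a delta from 0), then run length
// encodes the deltas: [1,2,3] -> deltas [1,1,1] -> rle [3,1].
func encodeDeltaRLE(values []int64) []int64 {
	deltas := make([]int64, len(values))
	prev := int64(0)
	for i, v := range values {
		deltas[i] = v - prev
		prev = v
	}
	return encodeRLE(deltas)
}
```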
plain
encoding stores data in series, one record after the next. plain currently supports the following logical data types:
- []byte
- string
- rune (utf8 character)
- int64
- key value pair
Underneath the logical data type is a raw data type, currently either varint or byte array. varints are stored using a dynamic number of bytes in the same manner as protobuf. byte arrays are stored as a varint representing the length of the byte array, followed by the bytes.
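A sketch of the two raw types using the standard library's varint support (helper names are illustrative; binary.PutVarint uses the same base-128 varint scheme as protobuf, with zig-zag encoding for signed values):

```go
import "encoding/binary"

// appendVarint appends v using a variable number of bytes
// (zig-zag signed varint, as in protobuf's sint64).
func appendVarint(buf []byte, v int64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutVarint(tmp[:], v)
	return append(buf, tmp[:n]...)
}

// appendBytes appends a byte array as a varint length prefix
// followed by the raw bytes.
func appendBytes(buf, b []byte) []byte {
	buf = appendVarint(buf, int64(len(b)))
	return append(buf, b...)
}
```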
dictionary_rle
combines plain and rle encodings to store []byte values. dictionary_rle first encodes the []byte values using plain encoding, ensuring there are no duplicates. dictionary_rle then takes the order index from plain and uses it as an int64, which is then encoded using rle.
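A sketch of the idea, building on the hypothetical encodeRLE above: each distinct []byte value is assigned the index at which it first appears in the dictionary (which would itself be stored with plain encoding), and the stream of indexes is run length encoded.

```go
// encodeDictionaryRLE returns the dictionary of distinct values in
// first-seen order together with the rle-encoded stream of indexes.
func encodeDictionaryRLE(values [][]byte) (dictionary [][]byte, indexes []int64) {
	seen := map[string]int64{}
	raw := make([]int64, 0, len(values))
	for _, v := range values {
		idx, ok := seen[string(v)]
		if !ok {
			idx = int64(len(dictionary))
			seen[string(v)] = idx
			dictionary = append(dictionary, v)
		}
		raw = append(raw, idx)
	}
	return dictionary, encodeRLE(raw)
}
```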
op_counter, op_actor
Although stored as separate columns, these two fields together form the lamport timestamp of the operation.
op_counter is encoded using delta_rle to take advantage of the fact that runs of text characters produce simple incrementing counter sequences (e.g. 1, 2, 3, ...).
op_actor is encoded using dictionary_rle. The expectation is that for any given page, there will likely be far fewer distinct op_actor values than there are operations in the page.
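For illustration, lamport timestamps built from these two columns are conventionally ordered by counter first, with the actor id breaking ties between concurrent operations (a sketch of that convention, not necessarily the ordering this code uses):

```go
import "bytes"

// compareOpID orders two (counter, actor) lamport timestamps: the counters
// are compared first, and the actor id breaks ties.
func compareOpID(aCounter int64, aActor []byte, bCounter int64, bActor []byte) int {
	switch {
	case aCounter < bCounter:
		return -1
	case aCounter > bCounter:
		return 1
	default:
		return bytes.Compare(aActor, bActor)
	}
}
```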
ref_counter, ref_actor
Also stored as separate columns, these two fields identify the causal reference for the operation.
ref_counter is encoded using delta_rle.
ref_actor is encoded using dictionary_rle.
To indicate the start of the document, the ref_counter, ref_actor pair of 0, nil should be used.
op_type
identifies the operation to be performed. The assumption is that each operation can be encoded into an int64 and that readers are responsible for interpreting the results. op_type is encoded using rle.
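For example, a writer and its readers might agree on a small set of int64 codes (these particular codes are hypothetical, not defined by automerge-go):

```go
// Hypothetical op_type codes; the format only requires that each operation
// fits in an int64 and that readers know how to interpret it.
const (
	opInsert int64 = iota // insert a value after the referenced operation
	opDelete              // mark the referenced operation as deleted
	opSet                 // overwrite the value at the referenced operation
)
```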
value
contains the data associated with the operation and may store multiple types of data. value is encoded using plain encoding.
A page
is a collection of operations. To keep pages from growing too large (and slowing down the application), pages may be split into smaller pages.
Each page maintains a bloom filter of all the op_ids contained within the page, so lookups can skip pages that do not contain a given element rather than scanning every record.
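A sketch of how a page might hang together (type and field names are illustrative, and bloomFilter stands in for whatever filter implementation is actually used):

```go
// bloomFilter is a stand-in interface; only membership tests matter here.
type bloomFilter interface {
	Add(opID []byte)
	MayContain(opID []byte) bool
}

// page holds the six encoded columns for a run of operations plus a bloom
// filter over the op_ids it contains, so lookups can skip pages that
// definitely do not hold a given operation.
type page struct {
	opCounter  []byte // delta_rle encoded
	opActor    []byte // dictionary_rle encoded
	refCounter []byte // delta_rle encoded
	refActor   []byte // dictionary_rle encoded
	opType     []byte // rle encoded
	value      []byte // plain encoded
	filter     bloomFilter
}

// mayContain reports whether the page could hold opID; false positives are
// possible, false negatives are not.
func (p *page) mayContain(opID []byte) bool {
	return p.filter.MayContain(opID)
}
```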
A node
(naming?) represents a collection of pages that share common logical and raw types. The current intent is to allow logical portions of a document, such as automerge Text and Table elements, to be stored under a single node. This idea may need to be revisited later.
When pages get too large, the node splits them into multiple smaller pages for performance reasons.
A document
represents the top level entity that the user will interact with. Documents contain a set of nodes.
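Putting the pieces together, the containment hierarchy might look roughly like this, building on the page sketch above (names are illustrative):

```go
// node groups pages that share the same logical and raw types, e.g. all the
// pages backing a single Text element.
type node struct {
	pages []*page
}

// document is the top level entity the user interacts with; it holds the
// set of nodes that make up the document.
type document struct {
	nodes map[string]*node
}
```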