facebook / akd

An implementation of an auditable key directory

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Unbounded ```HistoryTreeNode``` field

slawlor opened this issue · comments

HistoryTreeNode has an epochs field that is used to identify the epochs which an update took place.
As it stands now, the field can grow to an unbounded size. This will pose an issue when the corresponding DB record hits either soft or hard limits imposed by the DB schema or architecture.

ESTIMATED LIMITS

Epochs are u64 = 8 bytes.

  • For mysql, with the current VARBINARY(2000) definition, 2KB / 8B = ~ 250 values (we could always raise the max value in the definition, this is just for illustration)
  • For some alternate key-value db (KVDB) with say 100KB recommended soft limit for record size, ignoring other overhead, 100KB / 8B = ~ 12500 values

SOLUTION?

The associated epochs for a HistoryTreeNode can be obtained directly from the HistoryNodeState table given the node label, since each update corresponds to a HistoryNodeState.

  • For mysql, this can be done through one call with a join, or a separate, concurrent call to the DB.
  • For the alternate KVDB, one call is also sufficient. Instead of a "get" operation to fetch a singular HistoryTreeNode record, we can use a "scan" operation to fetch both the HistoryTreeNode record AND associated HistoryNodeState records at one go, provided we have a good sharding setup.

Accumulating the data from separate tables will incur some computation time on the database server and we will need to profile and assess the impact. However, I think this is necessary to avoid the unbounded record size issue. Are there any other potential solutions I've missed?

Exporting external group conversation

@kevinlewi
Good catch. The epoch vector will indeed need to be removed. However, how would we efficiently look up the HistoryNodeState table for the appropriate epoch? I believe we will perhaps need each HistoryNodeState to have a "pointer" to the previous valid epoch for the node, right? Or are you thinking of a more efficient way to do this?
Btw, at some point it would also be great to create public-facing issues on the github to track these, even just as a placeholder. Thanks for the finds!!

@afterdusk
Hmm, when an epoch is added to the epochs field, does it not remain perpetually? My impression was that all epochs where a HistoryNodeState exists for the node are valid epochs.
Based on this assumption, the following naive query can be run in parallel to collect the epochs where a state exists for a node:

SELECT a.epoch
FROM HistoryNodeStateTable a
WHERE a.label_len = <label_len> AND a.label_val = <label_val>

(In hindsight, a join to combine this with the query to HistoryTreeNodeTable wouldn’t be feasible, so two queries are needed.)
Also thank you for pointing out GitHub issues, I’ll post there for discussions like these in future!
Edit: To be clear, I'm proposing to remove the epochs field from the database table for HistoryTreeNode, but not the struct definition. When retrieving a HistoryTreeNode from storage, we will populate the epochs field with a query to the table storing HistoryNodeState 🙂

My 2 cents.

I've actually been thinking about this before as a place to optimize. For a lookup proof & a publish, we only need the most recent state of a HistoryTreeNode correct? (@kevinlewi confirm plz :) )

What if we change HistoryTreeNode to contain the "latest" state as a part of the struct. That way at the data-layer it's constant in size, and it actually saves us a 2nd get operation generally since we (most of the time) only need the latest state. However when we write, we'll also want to push and changed states to their own table. That way for a key-history request, we can still retrieve the information (via JOIN or whatever). It might be worth changing the storage contract to support something like this (instead of just get/set generics).

For a lookup proof & a publish, we only need the most recent state of a HistoryTreeNode correct?

Yes, @Jasleen1 to confirm.

However, I think the intermediate states are needed for the append-only / key history consistency proofs. And we still need the information about which epochs a node actually gets changed in for these proofs. Otherwise, we could end up having to search for all epochs between the birth and current epochs for changes -- which may not be very efficient...

Edit: left a comment here (https://github.com/novifinancial/akd/pull/113/files/9367f0ebc7dde560dfaf47491b1b32a5bb2eff51#diff-be3234553a30392a01a72481f1676c8338a439bc2a15a8a9f00f7923d1ef9353) which further elaborates on this

Yeah @kevinlewi so in MySQL it's relatively straightforward with

SELECT `epoch` 
FROM `states` 
WHERE 
  `label_len` = :len 
  AND `label_val` = :val 
  AND `epoch` <= :epoch 
ORDER BY `epoch` DESC 
LIMIT 1

which is searching over an indexed column space with "retrieve the first epoch which is <= :epoch with a specific label" In-memory, yes it's slower, but I did a bit of a nifty trick to not scan the entire "memory db". Construct an array from

0...=epoch, map it to the NodeStateKey with the provided specific label, and retrieve those batch of nodes. Any epochs where an update didn't occur will simply return a not-found, so after we just take the largest epoch.

In MySQL, this guarantees that the query is going to return a single u64 which at least saves us on the data transport (we aren't loading an entire node, or state or something, and wasting time serializing)