google / trillian

A transparent, highly scalable and cryptographically verifiable data store.

Proposed NodeID structure

pav-kv opened this issue

Proposal:

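type NodeID struct {
	path string // all bytes of the node path except the last one
	last byte   // the last byte of the path; only `bits` of its bits are used
	bits byte   // number of used bits in last (0 to 8)
}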

The whole node path consists of path bytes concatenated with 0 to 8 bits of the last byte depending on the bits field.

Advantages:

  • This type can be used as a Go map key directly, with zero copy of byte slices (string is not copied). A canonical representation must be chosen carefully: e.g. bits should always be > 0, unless this is a "zero" ID; all "unused" bits of last should be zeroed.
  • When used as a map key, it doesn't have to be byte-aligned - all non-multiple-of-8 lengths are supported automatically.
  • Zero-copy "sibling" operation can be implemented efficiently, by changing the last byte.
  • Arbitrary non-multiple-of-8 prefixes can be extracted with zero copy.
  • It takes less space than the current []byte + int implementation.
  • All the fields are hidden, and there is no real way for the client to shoot themselves in the foot when using this type.
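
The canonicalization rule and the sibling operation from the list above could be sketched roughly as follows. This is only an illustration: the constructor name NewNodeID, the choice to keep the used bits of last in its high-order positions, and the panic on short input are assumptions of this sketch, not part of the proposal.

```go
package main

import "fmt"

// NodeID mirrors the proposed struct. In this sketch the used bits of last
// are assumed to sit in its high-order positions.
type NodeID struct {
	path string
	last byte
	bits byte
}

// NewNodeID is a hypothetical constructor returning the canonical ID for the
// first bitLen bits of data. It panics if data holds fewer than bitLen bits.
func NewNodeID(data string, bitLen uint) NodeID {
	if bitLen == 0 {
		return NodeID{} // the "zero" ID is the only one with bits == 0
	}
	if uint(len(data))*8 < bitLen {
		panic("NewNodeID: not enough bits in data")
	}
	full := (bitLen - 1) / 8         // number of bytes kept in path
	bits := byte(bitLen - full*8)    // 1 to 8 bits kept in last
	mask := byte(0xFF) << (8 - bits) // zeroes the unused low bits
	// Slicing a string shares its bytes, so path is not copied.
	return NodeID{path: data[:full], last: data[full] & mask, bits: bits}
}

// Sibling flips the lowest used bit of last. Only the struct is copied;
// path keeps pointing at the same bytes.
func (n NodeID) Sibling() NodeID {
	if n.bits == 0 {
		return n // the zero ID has no sibling
	}
	n.last ^= 1 << (8 - n.bits)
	return n
}

func main() {
	id := NewNodeID("\xAB\xCD", 12) // path = "\xAB", last = 0xC0, bits = 4
	seen := map[NodeID]bool{id: true}

	// The same first 12 bits canonicalize to the same map key.
	fmt.Println(seen[NewNodeID("\xAB\xCF", 12)])
	// The sibling differs in the last used bit, so it is a different key.
	fmt.Println(seen[id.Sibling()])
	// Taking the sibling twice returns the original ID.
	fmt.Println(id.Sibling().Sibling() == id)
}
```

Because every field is comparable, the struct works as a map key without any conversion, and two IDs compare equal exactly when their canonical forms match.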

Disadvantage (not really):

  • The path field can't be used directly, even if the bit-length is a multiple of 8, because of the canonicalization requirement that the last byte is always used. However, note that the storage layer does not request the node directly, it always needs a prefix of path to load a tile. A path prefix can be extracted with zero copy if the storage uses multiple-of-8 strata.
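
To make the prefix argument concrete, here is a minimal sketch of prefix extraction. The method name Prefix, the high-order placement of used bits in last, and the panic on an over-long request are assumptions made for illustration only.

```go
package main

import "fmt"

// NodeID mirrors the proposed struct. In this sketch the used bits of last
// are assumed to sit in its high-order positions.
type NodeID struct {
	path string
	last byte
	bits byte
}

// Prefix is a hypothetical method returning the first bitLen bits of n in
// canonical form. It panics if bitLen exceeds the length of n.
func (n NodeID) Prefix(bitLen uint) NodeID {
	total := uint(len(n.path))*8 + uint(n.bits)
	if bitLen > total {
		panic("Prefix: bitLen exceeds ID length")
	}
	if bitLen == 0 {
		return NodeID{}
	}
	full := (bitLen - 1) / 8      // number of bytes kept in path
	bits := byte(bitLen - full*8) // 1 to 8 bits kept in last
	last := n.last
	if full < uint(len(n.path)) {
		last = n.path[full] // the new last byte comes from inside path
	}
	// n.path[:full] is a substring, so no bytes are copied.
	return NodeID{path: n.path[:full], last: last & (byte(0xFF) << (8 - bits)), bits: bits}
}

func main() {
	id := NodeID{path: "\xAB\xCD", last: 0xE0, bits: 3} // a 19-bit ID

	p8 := id.Prefix(8)   // byte-aligned: path is empty, last = 0xAB, bits = 8
	p12 := id.Prefix(12) // path = "\xAB", last = 0xC0, bits = 4
	fmt.Printf("%q %#x %d\n", p8.path, p8.last, p8.bits)
	fmt.Printf("%q %#x %d\n", p12.path, p12.last, p12.bits)
}
```

Note how the byte-aligned prefix still moves its final byte into last. That is the canonicalization cost described above, but the path part remains a zero-copy substring.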

Potential extensions:

  • Make the last field uint64 (bits is now between 1 and 64), so that for log trees path will always be empty (hence no allocations at all).
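
The uint64 extension might look roughly like the sketch below. The names NodeID64 and NewShortID, and the high-order placement of used bits, are hypothetical. The sketch covers only IDs of at most 64 bits, where path stays empty and nothing is allocated.

```go
package main

import "fmt"

// NodeID64 is a hypothetical variant of the proposed struct in which last
// holds 1 to 64 bits, assumed here to sit in its high-order positions.
type NodeID64 struct {
	path string
	last uint64
	bits uint8
}

// NewShortID is a hypothetical constructor for IDs of at most 64 bits.
// It keeps the top `bits` bits of value and zeroes the rest. The path field
// stays empty, so no allocation is needed.
func NewShortID(value uint64, bits uint8) NodeID64 {
	if bits > 64 {
		panic("NewShortID: at most 64 bits are supported")
	}
	if bits == 0 {
		return NodeID64{}
	}
	mask := ^uint64(0) << (64 - bits)
	return NodeID64{last: value & mask, bits: bits}
}

// Sibling flips the lowest used bit of last.
func (n NodeID64) Sibling() NodeID64 {
	if n.bits == 0 {
		return n // the zero ID has no sibling
	}
	n.last ^= 1 << (64 - n.bits)
	return n
}

func main() {
	id := NewShortID(0xABCD_0000_0000_0000, 12) // keeps the top 12 bits: 0xABC
	sib := id.Sibling()
	fmt.Printf("%#x %d\n", id.last, id.bits)
	fmt.Printf("%#x %d\n", sib.last, sib.bits)

	// The variant remains usable as a map key.
	visited := map[NodeID64]bool{id: true}
	fmt.Println(visited[sib.Sibling()])
}
```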

A quick test shows a roughly 3x speedup of the Siblings function (the benchmark runs Siblings for 512 32-byte IDs):

BenchmarkNodeIDSiblings-12     	     100	  12266448 ns/op
BenchmarkNodeID2Siblings-12    	     324	   3894282 ns/op