attaswift / BTree

Fast sorted collections for Swift using in-memory B-trees

Proof or argument that BTree concatenation is O(log n) for arbitrary operation sequences?

jafingerhut opened this issue

Thanks for publishing this library.

I have been looking at data structures like this one, including "Relaxed Radix Balanced Trees" (RRB trees), to determine precisely how to guarantee O(log n) time per operation for arbitrary sequences of operations that include concatenating two B-trees used to represent integer-indexed arrays. The challenge seems to be guaranteeing that the height of the tree stays O(log n) under an arbitrary sequence of concatenate, split, append, prepend, and similar operations.

Does this library, and/or your book "Optimizing Collections", contain any kind of proof or argument that this is the case?

For example, do you have any invariants on the "shape" of the BTree data structure that always hold, regardless of the sequence of operations that produced the tree, and that guarantee the depth remains O(log n)?

The depth of a B-tree is guaranteed to be logarithmic by virtue of the way it grows and the minimum fill factor of its nodes. My book does contain a very informal analysis, but this isn't its primary focus. A more detailed discussion can be found in Bayer & McCreight's original paper, or any decent data structures textbook.
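
The standard textbook form of this argument can be sketched as follows. It assumes the usual textbook definition, which may differ in detail from this library's node layout: a minimum degree $t \ge 2$, at least $t-1$ keys in every non-root node, at least $t$ children in every non-root internal node, at least one key in the root, and all leaves at the same depth $h$. The symbols $t$, $h$, and $n$ (the number of keys) are introduced here for illustration only.

```latex
% The root holds at least 1 key, so for h >= 1 it has at least 2 children.
% Depth i >= 1 therefore holds at least 2 t^{i-1} nodes, each with at least t-1 keys.
n \;\ge\; 1 + (t-1)\sum_{i=1}^{h} 2\,t^{\,i-1}
  \;=\; 1 + 2(t-1)\cdot\frac{t^{h}-1}{t-1}
  \;=\; 2t^{h} - 1
\quad\Longrightarrow\quad
h \;\le\; \log_t \frac{n+1}{2}
```

Under these assumptions, the bound depends only on the shape invariants (minimum fill and equal leaf depth), not on how the tree was built. Any operation that restores those invariants before returning keeps the height at O(log n).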

The join/split operations aren't usually described or analyzed in detail; however, it should not be too difficult to prove they maintain the B-tree invariants. I'm sure someone published a paper on these in the past 50 years -- it could be worth a search.