plar / go-adaptive-radix-tree

Adaptive Radix Trees implemented in Go

Keys don't support null bytes

banks opened this issue

Great job on building this library!

I was looking at the example in dump_tree.go and was puzzled by the use of null byte "keys" for leaf nodes (since there is no extra byte in their key to pivot on). I'm not sure how that can result in a correct outcome.

I may still be missing something, but this example shows why I think it's wrong, unless it's a conscious (and currently undocumented) decision not to allow null bytes in keys.

package main

import (
	"fmt"

	art "github.com/plar/go-adaptive-radix-tree"
)

func main() {
	tree := art.New()
	terms := []string{"A", "a", "aa", "aa\x00"}
	for _, term := range terms {
		tree.Insert(art.Key(term), term)
	}
	//fmt.Println(tree)

	// Should find "aa"
	fmt.Println(tree.Search(art.Key("aa")))
	// Should find "aa\x00"
	fmt.Println(tree.Search(art.Key("aa\x00")))

	// Expected Output (note the null byte doesn't print):
	// aa true
	// aa true
	//
	// Actual Output:
	// aa true
	// <nil> false

	// Dump output shows both the leaf "aa" stored with "key" of 0 as well as the
	// child with key 0:
	// ...
	//    │   ├── Node4 (0xc00000e2c0)
	//    │   │   prefix(0): [0 0 0 0 0 0 0 0 0 0] [··········]
	//    │   │   keys: [0 0 97 0] [··a·]
	//    │   │   children(3): [0xc00000e280 0xc00000e2b0 0xc00000e2e0 <nil>]
	//    │   │   ├── Leaf (0xc00000e280)
	//    │   │   │   key: [97 97] [aa]
	//    │   │   │   val: aa
	//    │   │   │
	//    │   │   ├── Leaf (0xc00000e2b0)
	//    │   │   │   key: [97 97 0] [aa·]
	//    │   │   │   val: aa
	// ...
}

Perhaps I'm missing something about the design that makes this expected behaviour? If leaves are going to be stored in this way (with an implicit null "key" indicating end of key/leaf), then it seems to rule out null bytes in keys entirely. That seems like a limitation that's at least worth documenting.

It seems to stem from charAt returning 0 if the index is out of range of the key:

if pos < 0 || pos >= len(k) {
	return 0
}
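
To make the collision concrete, here is a minimal standalone sketch. It copies the bounds check quoted above into a free function and adds a hypothetical helper, childByte, standing in for "the byte an inner node uses to pick a child slot at a given depth". It is not the library's code, just an illustration of why "aa" and "aa\x00" end up competing for the same slot.

```go
package main

import "fmt"

// charAt mirrors the check quoted above: any out-of-range position reads as 0.
func charAt(k []byte, pos int) byte {
	if pos < 0 || pos >= len(k) {
		return 0
	}
	return k[pos]
}

// childByte is a hypothetical helper (not part of the library): the byte an
// inner node at the given depth would use to choose a child for key k.
func childByte(k []byte, depth int) byte {
	return charAt(k, depth)
}

func main() {
	short := []byte("aa")
	long := []byte("aa\x00")

	// Both keys share the prefix "aa", so the inner node at depth 2 has to
	// tell them apart. "aa" is exhausted there and reads as 0; "aa\x00"
	// really contains a 0 there.
	fmt.Println(childByte(short, 2), childByte(long, 2)) // 0 0
	fmt.Println(childByte(short, 2) == childByte(long, 2)) // true
}
```

Since both keys map to byte 0 at depth 2, the node cannot distinguish "key ended here" from "key continues with a null byte", which matches the duplicated 0 entries in the dump above.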

I wonder if this is a relic of this code being based on libart: in C, strings are generally null-terminated. I've not tested it, but it would seem that libart has a bug here too: https://github.com/armon/libart/blob/73c46356e6752cc8b0b774a5d0f4d2a6832a5ed3/src/art.c#L616

When the key being inserted is a prefix of the existing internal node's key, that code attempts to create a new leaf and insert it into the internal node under the byte at key[offset+prefix_diff]. In the case described, that read is off the end of the array, because the leaf key is a prefix of the current node. So one of three things happens: a random char from some other bit of memory is used, a segfault occurs, or (possibly why it's not been noticed) the string was allocated with a null terminator and the off-by-one read always returns a null byte, which is equivalent to the behaviour of this Go implementation.

I don't know how actively this library is maintained and I don't intend to use it directly but was referring to the implementation as I'm writing an immutable ART in Go and was confused by this.

If the answer is "don't use null byte" that's cool, just wanted to make sure I understood the intention!

Thanks!

🤔 So the ART paper's pseudocode pretty much matches libart, which is probably where the issue originates, and the text doesn't describe how to store leaf nodes: they are children, but they have no further bytes in their key to pivot on from the inner node that holds them.

Worse, I've read 3 other ART implementations in Java, C++ and Rust and they all seem to have the same basic issue here, which is curious. In fact most of them seem to have the same potential array-out-of-bounds issue that I think libart does. I'm beginning to suspect my analysis is wrong but it's hard to see how.

In fact the Java implementation is the only one that doesn't have a bug I can see because it does this:

            int longest_prefix = longest_common_prefix(l2, depth);
            if (depth + longest_prefix >= this.key.length ||
                depth + longest_prefix >= key.length) {
                throw new UnsupportedOperationException("keys cannot be prefixes of other keys");
            }

So it's explicitly not letting you have keys that are prefixes of each other! That's a big surprise to a user of a library like this!
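
For comparison, the same guard can be sketched in Go. This is only a rough translation of the Java check quoted above, using hypothetical names (longestCommonPrefix, splitIndex, errPrefixKey) that don't come from any of the libraries discussed, and assuming depth is non-negative.

```go
package main

import (
	"errors"
	"fmt"
)

// errPrefixKey is a hypothetical error mirroring the Java exception message.
var errPrefixKey = errors.New("keys cannot be prefixes of other keys")

// longestCommonPrefix counts the bytes a and b share, starting at depth.
func longestCommonPrefix(a, b []byte, depth int) int {
	n := 0
	for depth+n < len(a) && depth+n < len(b) && a[depth+n] == b[depth+n] {
		n++
	}
	return n
}

// splitIndex returns the position at which the two keys first differ. If
// either key runs out before a differing byte is found, there is no byte to
// pivot on, so it reports an error instead of reading past the end.
// Note that identical keys are rejected too in this sketch.
func splitIndex(existing, incoming []byte, depth int) (int, error) {
	lcp := longestCommonPrefix(existing, incoming, depth)
	if depth+lcp >= len(existing) || depth+lcp >= len(incoming) {
		return 0, errPrefixKey
	}
	return depth + lcp, nil
}

func main() {
	fmt.Println(splitIndex([]byte("aa"), []byte("ab"), 0))     // 1 <nil>
	fmt.Println(splitIndex([]byte("aa"), []byte("aa\x00"), 0)) // 0 keys cannot be prefixes of other keys
}
```

Failing loudly like this avoids the out-of-bounds read, at the cost of rejecting a whole class of keys.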

I'm going to keep digging!

I spoke to Armon (who wrote libart): that library has always required keys to be null-terminated and has never supported nulls in the key, although that's not really documented anywhere. There are several open issues for the crash that I hypothesised about above, so it seems I'm right about that!

The other Go implementation also has a workaround for this, where it forces all keys to be null-terminated: https://github.com/kellydunn/go-art/blob/master/art_tree.go#L61. Your implementation is equivalent and so avoids the crash the same way, but has the same limitation on keys not being allowed to be prefixes of each other.

I'll leave this here so you can decide if you want to fix it or document!

Banks, thank you very much for such a deep investigation. I'm going to fix this bug pretty soon.

I've tried different ways to fix it and finally I found one that satisfies me. See my last commit for implementation details.