dgraph-io / badger

Fast key-value DB in Go.

Home Page:https://dgraph.io/badger

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Seek iterator with Reverse doesn't work

brk0v opened this issue · comments

Code:

package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"os"

	"github.com/dgraph-io/badger"
)

func main() {
	dir, err := ioutil.TempDir("", "badger")
	if err != nil {
		log.Fatal(err)
	}
	defer os.RemoveAll(dir)

	opts := badger.DefaultOptions
	opts.Dir = dir
	opts.ValueDir = dir

	db, err := badger.Open(opts)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	bkey := func(i int) []byte {
		return []byte(fmt.Sprintf("%09d", i))
	}
	bval := func(i int) []byte {
		return []byte(fmt.Sprintf("%025d", i))
	}

	txn := db.NewTransaction(true)

	// Fill in 1000 items
	n := 1000
	for i := 0; i < n; i++ {
		err := txn.Set(bkey(i), bval(i))
		if err != nil {
			log.Fatal(err)
		}
	}

	err = txn.Commit(nil)
	if err != nil {
		log.Fatal(err)
	}

	opt := badger.DefaultIteratorOptions
	opt.PrefetchSize = 10
	opt.Reverse = true

	// Iterate over 1000 items
	var count int
	err = db.View(func(txn *badger.Txn) error {
		it := txn.NewIterator(opt)
		for it.Seek([]byte("0")); it.ValidForPrefix([]byte("0")); it.Next() {
			count++
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Counted %d elements", count)
}

This code returns 0 elements in reverse mode and 1000 elements with reverse = false

@brk0v That's the expected behaviour. Seek searches for a key greater than or equal to the specifed key. Say you have the following keys "00" ,"01", "02". When you seek for "01" in reverse mode the iteration would start at "00".
Similarly in your case since you are seeking for "0" it will go past all the values and you won't find anything.

Ok. Got it.
My case is to emulate some kind of buckets inside badger db. I was adding a prefix to keys. This prefix is a bucket. But with current iterator logic I can't find the last item in my "bucket".

You still can. You just need to choose the seek key wisely. Following from @janardhan1993's example, if you have [00, 01, 02], and you want to start from 01, then you can use 01~ (with tilde character). That way, the seek would stop just before 01, and it the first item would be 01.

He wanted to get the last item in the bucket, not the first, @manishrjain - I think, you misunderstood his statement.

Getting the reverse latest data from badgerdb is still Pain IA.

It's all about the choice of seek key. If you want the first data from reverse, you can just call Rewind after setting Reverse option.

There is no way to seek reverse and rewind based on prefix.

You can set reverse and then Seek to prefix + 0xFF byte. That'd bring you to prefix.

@manishrjain I realize this comment is going on 3 years old, but it's subtly incorrect, and reverse iterating over a prefix with badger is still very difficult to get right.

The method suggested in this issue: to find the last key with prefix [0x1, 0x2], set reverse=true and then Seek to [0x1, 0x2, 0xFF]
But this doesn't quite solve the problem when data exists with both the prefix and the 0xFF byte, as in the following example DB:

0x1 0x2 0x3
0x1 0x2 0x4
0x1 0x2 0xFF <----- reverse-Seek([0x1, 0x2, 0xFF])
0x1 0x2 0xFF 0x1
0x1 0x2 0xFF 0x2
0x1 0x3
0x1 0x3 0x1

I believe the solution that accounts for all cases is much more involved.

  1. You need to compute the first key lexicographically greater than [prefix, 0xFF, 0xFF, ..., 0xFF], call this increment(prefix)
    a. increment([0x1, 0x2]) = [0x1, 0x3]
    b. increment([0x1, 0xFF, 0xFF]) = [0x2]
    c. increment([0xFF, 0xFF, 0xFF]) = nil, will need to call it.Rewind() instead
    i. ^ it.Rewind() does NOT work with a prefix AND reverse=True, will need to append one more 0xFF byte than the key length
    ii. So one special case when all bytes are 0xFF: increment([0xFF, 0xFF, 0xFF]) = [0xFF, 0xFF, 0xFF, 0xFF]
  2. Then reverse-Seek on increment(prefix)
  3. Finally, adjust by calling it.Next() if you arrive at a valid key that doesn't match the prefix. In this case, you have overshot the last key with the prefix by exactly one.

DB without exact match on incremented key:

0x1 0x2 0x3
0x1 0x2 0x4
0x1 0x2 0xFF 
0x1 0x2 0xFF 0x1
0x1 0x2 0xFF 0x2 <--- reverse-Seek([0x1, 0x3]), it.Valid() == true, arrive at the correct final key for the prefix [0x1, 0x2]
0x1 0x3 0x1

DB with exact match on incremented key:

0x1 0x2 0x3
0x1 0x2 0x4
0x1 0x2 0xFF 
0x1 0x2 0xFF 0x1
0x1 0x2 0xFF 0x2 <--- after it.Next(), arrive at the correct final key for the prefix [0x1, 0x2]
0x1 0x3 <----- reverse-Seek([0x1, 0x3]), it.Valid() == false, it.Item() != nil
0x1 0x3 0x1

@AlexMackowiak I don't think what you have will work correctly for case 1c. I've tried ... it doesn't play ball. If your prefix is 0xFF and you want to iterate in reverse, without prior knowledge of the key length, I think it's impossible. It's almost like there's no choice but to reserve part of your key space for "impossible" values so that you can seek to them.
Therefore, IMO, the API / semantics for reverse iteration is quite bad, and needs overhauling.

@msackman Yeah you're actually right, I stumbled across this myself in some tests about a week later and I guess I forgot to follow up here. it.Rewind() does not work with a prefix and reverse=true how I expected it should, and I even filed an issue on the forum: https://discuss.dgraph.io/t/iterator-rewind-invalid-with-reverse-true-and-prefix-option-set/15518 but as far as I can tell this is the expected behavior?

I believe the actual case 1c that I ended up writing in my code was just to append as many 0xFF values as (key length +1), and just hope that case doesn't come up much if ever in practice (which is a reasonable assumption for my application).

@AlexMackowiak Yep. I think in conclusion it's actually impossible to seek to the last item in a database without prior knowledge of the key length. Personally, I'm switching back to BoltDB as that seems to have a much better designed API IMO.

Just wanted to post correct code for reverse iteration.

func incrementPrefix(prefix []byte) []byte {
	result := make([]byte, len(prefix))
	copy(result, prefix)
	var len = len(prefix)
	for len > 0 {
		if result[len-1] == 0xFF {
			len -= 1
		} else {
			result[len-1] += 1
			break
		}
	}
	return result[0:len]
}

// Seeks to the last key that is valid for the given prefix.  For 
// correct results iterator must be setup to iterate iterate in the
// reverse direction and must not be limited to a certain prefix
func SeekLast(it *badger.Iterator, prefix []byte) {
	i := incrementPrefix(prefix);
	it.Seek(i);
	if (it.Valid() && bytes.Equal(i, it.Item().Key())) {
		it.Next();
	}
}

Example Usage

func example(db *badger.DB, prefix []byte) error {
	return db.View(func(txn *badger.Txn) error {
		opts := badger.DefaultIteratorOptions
		opts.Reverse = true
		var it = txn.NewIterator(opts)
		defer it.Close()
		for SeekLast(it, prefix); it.ValidForPrefix(prefix); it.Next() {
			// Do stuff with item...
		}
		return nil
	})
}

Test

func Test_seekForReverseIterate(t *testing.T) {
	type args struct {
		prefix []byte
	}
	tests := []struct {
		name string
		args args
		want []byte
	}{
		{"Standard Case", args{[]byte{0x05, 0x45, 0x77}}, []byte{0x05, 0x45, 0x78}},
		{"Some 0xFF", args{[]byte{0x05, 0xFF, 0xFF}}, []byte{0x06}},
		{"All 0XFF", args{[]byte{0xFF, 0xFF, 0xFF}}, []byte{}},
		{"Empty", args{[]byte{}}, []byte{}},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := incrementPrefix(tt.args.prefix); !reflect.DeepEqual(got, tt.want) {
				t.Errorf("seekForReverseIterate() = %v, want %v", got, tt.want)
			}
		})
	}
}


func getKeys(db *badger.DB, prefix []byte) ([][]byte)  {
	result := [][]byte{};
	err := db.View(func(txn *badger.Txn) error {
		opts := badger.DefaultIteratorOptions
		opts.Reverse = true
		var it = txn.NewIterator(opts)
		defer it.Close()
		for SeekLast(it, prefix); it.ValidForPrefix(prefix); it.Next() {
			result = append(result, it.Item().KeyCopy(nil));
		}
		return nil
	})
	if (err != nil) {
		log.Fatal(err);
	}
	return result;
}

func Test_SeekLast(t *testing.T) {

	// construct test database in memory
	db, err := badger.Open(badger.DefaultOptions("").WithInMemory(true));
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close();

	keys := [][]byte{
		{0xFF, 0xFF, 0xFF, 0xFF, 0xFF},
		{0xFF, 0x1, 0xFF},
		{0xFF, 0x1},
		{0xFF},
		{0x2},
		{0x1, 0xFF, 0xFF, 0xFF, 0xFF},
		{0x1, 0x1, 0xFF},
		{0x1, 0x1},
		{0x1},
	}


	err = db.Update(func(txn *badger.Txn) error {
		for _, key := range keys {
			err := txn.Set(key, []byte{});
			if (err != nil) {
				log.Fatal(err.Error());
			}
		}
		return nil;
	});
	if (err != nil) {
		log.Fatal(err);
	}

	// Function to get all keys
	empty := [][]byte{};

	// Do tests
	type args struct {
		prefix []byte
	}
	tests := []struct {
		name string
		args args
		want [][]byte
	}{
		{"Prefix [0x01]", args{[]byte{0x01}}, keys[5:]},
		{"Prefix []", args{[]byte{}}, keys},
		{"Prefix nil", args{nil}, keys},
		{"Prefix [0x00]", args{[]byte{0x00}}, empty},
		{"Prefix [0x01 0x01]", args{[]byte{0x01, 0x01}}, keys[6:8]},
		{"Prefix [0x02]", args{[]byte{0x02}}, keys[4:5]},
		{"Prefix [0xFF]", args{[]byte{0xFF}}, keys[:4]},
		{"Prefix [0xFF 0xFF]", args{[]byte{0xFF, 0xFF}}, keys[:1]},
		{"Prefix [0xFF 0xFF 0xFF 0xFF 0xFF 0xFF]", args{[]byte{0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}}, empty},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := getKeys(db, tt.args.prefix); !reflect.DeepEqual(got, tt.want) {
				t.Errorf("getKeys(%v) = %v, want %v", tt.args, got, tt.want)
			}
		})
	}
}

FWIW, I believe Iterator.Seek should behave like SeekLast by default for reverse direction iterators. At least there is a simple work around using the above.

Edit - 3/20/2022

+Updated implementation and test coverage of SeekLast to correctly handle the case where incrementPrefix returns a key present in the store.
+Updated the comment on SeekLast to indicate that it will not work for iterators setup to only iterate over a specific prefix.

@PeterMcKinnis I believe this solution is currently missing one case and loses some efficiency over optimal reverse prefix scanning, great idea though we should definitely post some code for a correct implementation!

  1. When the result of incrementPrefix() is itself a key in your database, the resulting iterator will not be valid for the prefix.
    a. This is the "DB with exact match on incremented key" case from my original post, and requires an it.Next() call to fix.

  2. You did legitimately find a workaround to the correctness issue when the prefix is entirely 0xFF!
    a. This workaround is to ditch the badger Prefix IteratorOption entirely and check the prefix match at each key yourself
    i. (Adding opts.Prefix = prefix to Test_SeekLast causes two test failures)
    b. There is one downside to this though, finding the first prefix match in BadgerDB is already decently slow
    and not specifying the Prefix IteratorOption will slow it down even further

The code I ended up writing to account for everything in the reverse prefix case is the following:

options := badger.DefaultIteratorOptions
options.Reverse = true
options.Prefix = rangePrefixKey
it := tx.NewIterator(options)
...
incRangePrefix := incrementPrefix(rangePrefixKey)
if len(incRangePrefix) > 0 {
    it.Seek(incRangePrefix) // Point to the first key <= the range prefix
    if it.Item() != nil && bytes.Equal(it.Item().Key(), incRangePrefix) {
        // Exact match on incremented range prefix, need to move back one key for the last key matching the prefix
        it.Next()
    }
} else {
    // Entire range prefix was just [0xFF, 0xFF, 0xFF, ..., 0xFF], use the last key in the database
    // There's currently a bug with it.Rewind() when in reverse and with a prefix, it does not correctly find the largest key
    // https://discuss.dgraph.io/t/iterator-rewind-invalid-with-reverse-true-and-prefix-option-set/15518
    it.Seek(bytes.Repeat([]byte{0xFF}, len(rangePrefixKey)+1))
}

EDIT: Even this is sometimes invalid for the case where the prefix is all 0xFF. Our goal here is to point the iterator to the last key in the database, workarounds to it.Rewind() being broken are to either know the length of the longest key in the database or make a new iterator without the prefix option set and then it.Rewind() will correctly go to the last key in the database.

@AlexMackowiak

I updated SeekLast above to address the bug you mentioned. Good catch!

I implemented your solution but, it also fails the case where incrementPrefix is a valid key. Your implementation also works correctly except in noted edge case. It will also be faster.

@PeterMcKinnis

I replaced your SeekLast with my implementation as follows and I don't see any test failures pertaining to that increment case

func SeekLast(it *badger.Iterator, prefix []byte) {
	incRangePrefix := incrementPrefix(prefix)
	if len(incRangePrefix) > 0 {
		it.Seek(incRangePrefix) // Point to the first key <= the range prefix
		if it.Item() != nil && bytes.Equal(it.Item().Key(), incRangePrefix) {
			// Exact match on incremented range prefix, need to move back one key for the last key matching the prefix
			it.Next()
		}
	} else {
		// Entire range prefix was just [0xFF, 0xFF, 0xFF, ..., 0xFF], use the last key in the database
		// There's currently a bug with it.Rewind() when in reverse and with a prefix, it does not correctly find the largest key
		// https://discuss.dgraph.io/t/iterator-rewind-invalid-with-reverse-true-and-prefix-option-set/15518
                it.Seek(bytes.Repeat([]byte{0xFF}, len(rangePrefixKey)+1))
	}
}

I also have extensive test cases on this particular case, although they were written as part of a much larger, proprietary system that I wouldn't be able to share here.

I did not however have test coverage of the following case!

  1. When the prefix is all 0xFF
  2. but another key exists which matches the 0xFF ... 0xFF prefix but is longer than the prefix

My code was incorrect in this case, and to be truly correct you would need to know the length of the longest key in the database, as a workaround to it.Rewind() being unusable when reverse=true and prefix is set. Or alternatively make a new iterator without prefix set, so it's going to get pretty gross either way. I'll update my code posted with a note about this.