cockroachdb / pebble

RocksDB/LevelDB inspired key-value database in Go

Why is sync mode so slow?

millken opened this issue

go version: go1.20
os: debian12, mac m1
pebble: 1.0.0, 1.1.0

no batch: 3m17.114011283s
batch: 282.207716ms

package pebbletest

import (
	"bytes"
	"encoding/binary"
	"hash/fnv"
	"os"
	"testing"
	"time"

	"github.com/cockroachdb/pebble"
	"github.com/stretchr/testify/require"
)

func TestPebble(t *testing.T) {
	require := require.New(t)
	opts := &pebble.Options{}
	db, err := pebble.Open("test2.db", opts)
	require.NoError(err)
	defer func() {
		db.Close()
		os.RemoveAll("test2.db")
	}()
	testNum := 100000
	bytes100 := bytes.Repeat([]byte("a"), 100)
	t.Run("no batch", func(t *testing.T) {
		time1 := time.Now()
		for round := uint32(0); round < uint32(testNum); round++ {
			// Derive a 4-byte key from the FNV-1a hash of the counter.
			h := fnv.New32a()
			buf := make([]byte, 4)
			binary.LittleEndian.PutUint32(buf, round)
			h.Write(buf)
			k := h.Sum(nil)
			// Each Set with pebble.Sync waits for an fsync of the WAL.
			if err := db.Set(k, bytes100, pebble.Sync); err != nil {
				t.Error(err)
				return
			}
		}
		t.Logf("spent: %s", time.Since(time1))
	})
	t.Run("batch", func(t *testing.T) {
		time1 := time.Now()
		wb := db.NewBatch()
		for round := uint32(0); round < uint32(testNum); round++ {
			h := fnv.New32a()
			buf := make([]byte, 4)
			binary.LittleEndian.PutUint32(buf, round)
			h.Write(buf)
			k := h.Sum(nil)
			// The write options are ignored for individual batch ops;
			// durability is determined by the Commit below.
			if err := wb.Set(k, bytes100, pebble.Sync); err != nil {
				t.Error(err)
				return
			}
		}
		// A single Commit with pebble.Sync: one WAL write and one fsync
		// cover all 100,000 keys.
		if err := wb.Commit(pebble.Sync); err != nil {
			t.Error(err)
		}
		t.Logf("spent: %s", time.Since(time1))
	})
}

os: debian12, mac m1

You're running debian on an m1 mac? Historically fsync latency is extremely poor on macs, but I thought it was a consequence of macOS, not necessarily the physical hardware.

Both debian12 (linux amd64) and the mac m1 (darwin arm64) were tested separately. The results are similar, and sync performance is poor on both.

Poor relative to what? If you time just the Commit in the batch case, what do you get? You're hashing 100,000 numbers, building a batch mapping them to 100-byte values, committing and syncing them in 282.207716ms.

If you'd like to tune, see the Options documentation.

With the same test code, replacing pebble with boltdb or badger, the 'no batch' case takes about 4s, while pebble requires 3m17s, which I can't understand.

I can only assume they're not synchronously syncing to durable storage. 100,000 synchronous commits in sequence is 100,000 disk writes. If each disk write takes 500 microseconds, you're looking at at least 50s.