dgraph-io / ristretto

A high performance memory-bound Go cache

Home Page:https://dgraph.io/blog/post/introducing-ristretto-high-perf-go-cache/

CM-Sketch: identical rows, maybe a mistake?

cocktail828 opened this issue

Hi guys:

The sketch does not behave the same way as the reference implementation at https://github.com/dgryski/go-tinylfu/blob/master/cm4.go.
Could this be a mistake? Can anybody check it?
The reference implementation:

func (c *cm4) add(keyh uint64) {
	h1, h2 := uint32(keyh), uint32(keyh>>32)

	for i := range c.s {
		// Here: the rows differ because the position depends on the row index i.
		pos := (h1 + uint32(i)*h2) & c.mask
		c.s[i].inc(pos)
	}
}
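
To make the effect of that line concrete, here is a minimal standalone sketch of the same `h1 + i*h2` indexing. The function name `rowPositions` and the `depth` and `mask` parameters are made up for illustration and are not part of go-tinylfu or ristretto.

```go
package main

import "fmt"

// rowPositions returns the counter index used in each of depth rows for a
// 64-bit key hash, following the h1 + i*h2 scheme of the reference add().
// depth and mask are illustrative parameters, not taken from either library.
func rowPositions(keyh uint64, depth int, mask uint32) []uint32 {
	h1, h2 := uint32(keyh), uint32(keyh>>32)
	pos := make([]uint32, depth)
	for i := range pos {
		pos[i] = (h1 + uint32(i)*h2) & mask
	}
	return pos
}

func main() {
	const mask = 15 // 16 counters per row
	// Both hashes share the low 32 bits, so they collide in row 0,
	// but their high halves differ, so the later rows separate them.
	a := uint64(1)<<32 | 5
	b := uint64(2)<<32 | 5
	fmt.Println(rowPositions(a, 4, mask)) // [5 6 7 8]
	fmt.Println(rowPositions(b, 4, mask)) // [5 7 9 11]
}
```

Two keys that collide in one row can still land on different counters in the other rows, which is what lets the minimum over the rows filter out collisions.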

The corresponding line in ristretto:

s.rows[i].increment((hashed ^ s.seed[i]) & s.mask)
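
A small standalone check of this XOR-then-mask indexing may help when verifying the behaviour. The helper names, seed values, and mask below are hypothetical and are not ristretto's own. The sketch relies on one property: `(a ^ seed) & mask == (b ^ seed) & mask` holds exactly when `a & mask == b & mask`, so two hashes that share their low bits meet in every row, whatever the seeds are.

```go
package main

import "fmt"

// xorPositions mimics the indexing (hashed ^ seed) & mask for each row.
// The seeds slice and mask are illustrative values, not ristretto's own.
func xorPositions(hashed uint64, seeds []uint64, mask uint64) []uint64 {
	pos := make([]uint64, len(seeds))
	for i, seed := range seeds {
		pos[i] = (hashed ^ seed) & mask
	}
	return pos
}

// collideEverywhere reports whether two hashes hit the same counter in every row.
func collideEverywhere(a, b uint64, seeds []uint64, mask uint64) bool {
	pa, pb := xorPositions(a, seeds, mask), xorPositions(b, seeds, mask)
	for i := range pa {
		if pa[i] != pb[i] {
			return false
		}
	}
	return true
}

func main() {
	seeds := []uint64{0x9e3779b97f4a7c15, 0xdeadbeef, 42, 7}
	const mask = 15 // 16 counters per row
	a, b := uint64(0x15), uint64(0xabcd05) // same low 4 bits
	fmt.Println(xorPositions(a, seeds, mask))         // [0 10 15 2]
	fmt.Println(xorPositions(b, seeds, mask))         // [0 10 15 2]
	fmt.Println(collideEverywhere(a, b, seeds, mask)) // true
}
```

With this indexing each row is a permutation of the same counters, so distinct seeds change where a key lands but not which keys collide with it.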

If so, shall we correct the mistake in the following way, using a different seed for each row?

func newCmSketch(numCounters int64) *cmSketch {
	if numCounters == 0 {
		panic("cmSketch: bad numCounters")
	}
	// Get the next power of 2 for better cache performance.
	numCounters = next2Power(numCounters)
	sketch := &cmSketch{mask: uint64(numCounters - 1)}
	// Initialize rows of counters and seeds.
	for i := 0; i < cmDepth; i++ {
		sketch.seed[i] = rand.New(rand.NewSource(time.Now().UnixNano())).Uint64()
		sketch.rows[i] = newCmRow(numCounters)
	}
	return sketch
}