al8n / stretto

Stretto is a Rust implementation of Dgraph's Ristretto (https://github.com/dgraph-io/ristretto): a high-performance, memory-bound Rust cache.

Dataset for tests and benchmarks

al8n opened this issue

commented

We need some datasets that can be used to give more insight into the performance and the hit ratio when we add new features.

You could try some widely adopted traces, for example DS1 and S3 from the ARC paper. In my benchmark, Ristretto does not perform well. If Stretto's implementation is exactly the same as Ristretto's, I'm interested to see the results. My cache package, with benchmark results: https://github.com/Yiling-J/theine-go
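
To make the comparison concrete, here is a minimal sketch of how a trace could be replayed to measure the hit ratio. It assumes the trace has already been parsed into a slice of `u64` keys. The `TraceCache` trait and the `FifoCache` stand-in are hypothetical names used only for illustration; they are not part of Stretto, Ristretto, or theine-go.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical minimal interface for a cache under test (an assumption of this sketch).
pub trait TraceCache {
    /// Returns true if the key is currently cached.
    fn contains(&mut self, key: u64) -> bool;
    /// Admits the key, evicting another key if the cache is full.
    fn insert(&mut self, key: u64);
}

/// Tiny FIFO cache, used only as a stand-in for a real cache.
pub struct FifoCache {
    capacity: usize,
    order: VecDeque<u64>,
    keys: HashSet<u64>,
}

impl FifoCache {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, order: VecDeque::new(), keys: HashSet::new() }
    }
}

impl TraceCache for FifoCache {
    fn contains(&mut self, key: u64) -> bool {
        self.keys.contains(&key)
    }

    fn insert(&mut self, key: u64) {
        if self.capacity == 0 || self.keys.contains(&key) {
            return;
        }
        // Evict the oldest key once the cache is full.
        if self.order.len() == self.capacity {
            if let Some(old) = self.order.pop_front() {
                self.keys.remove(&old);
            }
        }
        self.order.push_back(key);
        self.keys.insert(key);
    }
}

/// Replays the trace: every access is a lookup, and every miss is followed by an insert.
/// Returns hits / total accesses, or 0.0 for an empty trace.
pub fn hit_ratio<C: TraceCache>(cache: &mut C, trace: &[u64]) -> f64 {
    if trace.is_empty() {
        return 0.0;
    }
    let mut hits = 0usize;
    for &key in trace {
        if cache.contains(key) {
            hits += 1;
        } else {
            cache.insert(key);
        }
    }
    hits as f64 / trace.len() as f64
}
```

Any cache could be plugged in by implementing the trait, so the same trace and the same replay loop are used for every candidate.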

commented

The low hit ratio for Ristretto in your benchmark may be caused by the write buffer. In Ristretto, if you insert an item and then try to read it while it is still in the write buffer, you will get a miss.
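
A minimal single-threaded sketch of that effect, assuming a cache whose inserts go to a write buffer first and only reach the map when the buffer is drained. `BufferFirstCache` and its methods are hypothetical names for illustration, not Ristretto's or Stretto's actual types.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical cache that queues writes before applying them (illustration only).
pub struct BufferFirstCache {
    map: HashMap<u64, String>,
    write_buf: VecDeque<(u64, String)>,
}

impl BufferFirstCache {
    pub fn new() -> Self {
        Self { map: HashMap::new(), write_buf: VecDeque::new() }
    }

    /// The write only lands in the buffer; the map is not touched yet.
    pub fn insert(&mut self, key: u64, value: String) {
        self.write_buf.push_back((key, value));
    }

    /// Reads consult the map only, so buffered items are invisible.
    pub fn get(&self, key: u64) -> Option<&String> {
        self.map.get(&key)
    }

    /// Applies all buffered writes to the map.
    pub fn drain(&mut self) {
        while let Some((key, value)) = self.write_buf.pop_front() {
            self.map.insert(key, value);
        }
    }
}
```

With this layout, a `get` issued right after `insert` returns `None` until `drain` has run, which a benchmark counts as a miss.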

I agree, but judging from the image in their blog post and README, the hit ratio should be higher. If you use the same technique (write to the buffer first) and your benchmark shows a similar result, I think we can confirm that. BTW, both Ben and I believe that writing to the map first is better; you can take a look at Ben's reply: https://www.reddit.com/r/golang/comments/12uql3y/theine_020_released_a_generic_cache_which_has/

commented

Yeah, writing to the map first is better. I was thinking of adding a way for the cache to read an item while it is still in the write buffer, e.g. wrap the item in an Arc, keep a hashmap of the items currently in the buffer, and remove each entry once the item has been handled. But I do not think the idea is good enough. I would appreciate any ideas about this feature.
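
A rough single-threaded sketch of that idea, under the assumption that values are shared through `Arc` and a side hashmap tracks what is still buffered. `PendingAwareCache`, `pending`, and `pending_len` are hypothetical names, not existing Stretto APIs.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Arc;

/// Hypothetical cache whose reads can see items still waiting in the write buffer.
pub struct PendingAwareCache {
    map: HashMap<u64, Arc<String>>,
    /// Items that are buffered but not yet applied to `map`.
    pending: HashMap<u64, Arc<String>>,
    write_buf: VecDeque<(u64, Arc<String>)>,
}

impl PendingAwareCache {
    pub fn new() -> Self {
        Self { map: HashMap::new(), pending: HashMap::new(), write_buf: VecDeque::new() }
    }

    /// Buffers the write and records the same Arc in `pending` so reads can find it.
    pub fn insert(&mut self, key: u64, value: String) {
        let value = Arc::new(value);
        self.pending.insert(key, Arc::clone(&value));
        self.write_buf.push_back((key, value));
    }

    /// Checks pending writes first (they are newer), then the main map.
    pub fn get(&self, key: u64) -> Option<Arc<String>> {
        self.pending.get(&key).or_else(|| self.map.get(&key)).cloned()
    }

    /// Applies buffered writes and clears their pending entries.
    pub fn drain(&mut self) {
        while let Some((key, value)) = self.write_buf.pop_front() {
            // Only clear the pending entry if it still refers to this exact write;
            // a later insert for the same key may have replaced it.
            if self.pending.get(&key).map_or(false, |p| Arc::ptr_eq(p, &value)) {
                self.pending.remove(&key);
            }
            self.map.insert(key, value);
        }
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }
}
```

The cost is a second hashmap update on every write and every drain, and in a concurrent cache that side map would need its own synchronization.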

I think it's just a matter of switching the order: write to the map first, then add to the write buffer. I think Ristretto writes to the buffer first because the write buffer is a channel, and it drops some Sets under high concurrency when the channel is full. This improves write performance, but it may not be the expected behavior for Ristretto users.
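
A minimal sketch of the map-first order, assuming the write buffer is a bounded channel that only carries the key for policy bookkeeping. `MapFirstCache` is a hypothetical name, and the sketch is single-threaded for brevity; it is not how Ristretto or Stretto is actually structured.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

/// Hypothetical cache that writes to the map before notifying the policy.
pub struct MapFirstCache {
    map: HashMap<u64, String>,
    /// Bounded channel standing in for the write buffer.
    tx: SyncSender<u64>,
}

impl MapFirstCache {
    /// Returns the cache plus the receiving end that a policy worker would consume.
    pub fn new(buffer_size: usize) -> (Self, Receiver<u64>) {
        let (tx, rx) = sync_channel(buffer_size);
        (Self { map: HashMap::new(), tx }, rx)
    }

    /// Writes to the map first, then tries to enqueue the key without blocking.
    /// Returns false when the buffer is full or closed and the policy event was dropped.
    pub fn insert(&mut self, key: u64, value: String) -> bool {
        self.map.insert(key, value);
        self.tx.try_send(key).is_ok()
    }

    /// The item is readable immediately, whether or not the event was queued.
    pub fn get(&self, key: u64) -> Option<&String> {
        self.map.get(&key)
    }
}
```

In this sketch a full buffer drops only the policy event, so the item stays readable. What to do with an entry whose event was dropped (keep it, or remove it from the map again) is a design decision this sketch leaves open.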