dgraph-io / ristretto

A high performance memory-bound Go cache

Home Page: https://dgraph.io/blog/post/introducing-ristretto-high-perf-go-cache/

Ristretto causing memory corruption

apoggi-carecloud opened this issue

After adding Ristretto to our microservice orchestrator, written in Go, we have been having data corruption issues. The cached data has been randomly inaccurate, forcing me to remove this in-memory cache implementation. I'm not sure how I would go about logging this information. I can try to take some time over the weekend to create an example repo that exhibits the issue.

Are the values being returned by Get corrupted (as in, randomly mutated), or mismatched with other keys? If you can give me a general idea of how you're using it and the exact issue you were running into, I can invest time into this.

Values returned by Get are corrupted, but there does not appear to be a consistent pattern, and the errors appear to correct themselves at random.

@apoggi-carecloud Out of curiosity, does the cached data have anything referenced by pointers? Like slices or maps?

@johnnyfeng-bread The cached data does contain slices and maps

I might be wrong but I believe Ristretto places the data as is into a syncmap, which would mean that the original data, the cached copy, and any retrieved copies would all have the same pointer references. So a change to any one of the referenced values would change all of them. My suspicion is that's what's going on here.

OK, so concurrent access is not safe then.

Concurrent access should be safe for non-reference types like strings or numbers. The issue is rather that storing values that contain references of any sort is not safe, whether or not they are accessed concurrently. So your data shouldn't include pointers, which means no slices or maps.
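
The sharing described here can be reproduced without Ristretto at all. The following is a minimal sketch in which a plain Go map stands in for the cache; the `profile` type, its fields, and `aliasDemo` are hypothetical names chosen for illustration, and nothing in it reflects how Ristretto stores values internally.

```go
package main

import "fmt"

// profile is a hypothetical cached value. Its Tags slice and Attrs map
// refer to storage that lives outside the struct itself.
type profile struct {
	Name  string
	Tags  []string
	Attrs map[string]string
}

// aliasDemo stores a profile in a plain map that stands in for the cache,
// mutates the original afterwards, and returns what a later lookup sees.
func aliasDemo() profile {
	cache := map[string]interface{}{}

	p := profile{
		Name:  "alice",
		Tags:  []string{"admin"},
		Attrs: map[string]string{"plan": "free"},
	}
	cache["user:1"] = p // copies the struct, but not the slice or map data

	// The caller keeps using p after caching it.
	p.Name = "bob"           // not visible through the cache: plain field
	p.Tags[0] = "guest"      // visible: same backing array
	p.Attrs["plan"] = "paid" // visible: same underlying map

	return cache["user:1"].(profile)
}

func main() {
	got := aliasDemo()
	fmt.Println(got.Name, got.Tags, got.Attrs)
	// prints: alice [guest] map[plan:paid]
}
```

No goroutines are involved, which matches the point that the problem is shared references rather than concurrency.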

This would need someone from the Ristretto team to confirm, as I'm not 100% on the details of how this is implemented.

@karlmcguire

Concurrent access to the cache is safe (as in, Get/Set/Del are all atomic operations). Ristretto guarantees that the values returned haven't been corrupted, but if the values are pointers there is no guarantee that the underlying data hasn't been mutated (if something else holds the pointer).
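
One way a caller could protect itself from this is to copy values on the way into and out of the cache, so that nothing outside the cache holds a reference to what is stored. This is only a sketch under stated assumptions: `profile`, `clone`, and `copyingCache` are hypothetical names, the plain map stands in for the real cache and is not safe for concurrent use, and none of this is part of Ristretto's API.

```go
package main

import "fmt"

// profile is a hypothetical cached value containing a slice and a map.
type profile struct {
	Tags  []string
	Attrs map[string]string
}

// clone returns a copy of p that shares no slice or map storage with it.
func (p profile) clone() profile {
	out := profile{
		Tags:  append([]string(nil), p.Tags...),
		Attrs: make(map[string]string, len(p.Attrs)),
	}
	for k, v := range p.Attrs {
		out.Attrs[k] = v
	}
	return out
}

// copyingCache is a hypothetical wrapper that copies on Set and on Get.
// The map stands in for the underlying cache (assumption for illustration).
type copyingCache struct {
	items map[string]profile
}

func newCopyingCache() *copyingCache {
	return &copyingCache{items: map[string]profile{}}
}

// Set stores a private copy of p.
func (c *copyingCache) Set(key string, p profile) {
	c.items[key] = p.clone()
}

// Get returns a fresh copy, so callers cannot mutate the stored value.
func (c *copyingCache) Get(key string) (profile, bool) {
	p, ok := c.items[key]
	if !ok {
		return profile{}, false
	}
	return p.clone(), true
}

func main() {
	c := newCopyingCache()
	p := profile{Tags: []string{"admin"}, Attrs: map[string]string{"plan": "free"}}
	c.Set("user:1", p)

	p.Tags[0] = "guest" // mutate the original after caching

	first, _ := c.Get("user:1")
	first.Attrs["plan"] = "paid" // mutate a retrieved copy

	second, _ := c.Get("user:1")
	fmt.Println(second.Tags[0], second.Attrs["plan"])
	// prints: admin free
}
```

The trade-off is an extra allocation and copy on every Set and Get, which may matter for large values or hot paths.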

If enough people need a "transaction"-like API, we can look into adding that, where you can Get-Use-Set in one atomic operation, but for now I don't think we need it.