Suor / django-cacheops

A slick ORM cache with automatic granular event-driven invalidation.

Better memory limit support

Suor opened this issue

For now cacheops offers two imperfect strategies to handle memory limits. Both have flaws. I'm creating this issue to track the topic.

Alternative strategies available now:

  1. Switch off maxmemory. Use an external periodic job to run custom cleanup when memory usage exceeds the limit. Cons: clunky; the lag before eviction can cause arbitrary memory use.
  2. Use keyspace notifications and an external daemon to subscribe and manage the cache structure. Cons: clunky; can miss events upon disconnect; being async, eviction could delete more than needed.
  3. Store the set of conj keys in a cache key and check integrity on cache fetch. Cons: slower fetch, substantial code complication.

The ideal solution would be a custom eviction strategy, probably Lua-based - redis/redis#2319. Another good solution could be managing the cache structure with a Lua script subscribed to keyspace notifications - redis/redis#2540.

@Suor, what are the chances that you could provide a script (or guidance on what the script needs to do) for option 1? We need to put something in place until cacheops supports a solid solution natively (hopefully option 2 or 3).

I won't provide a script, but I can elaborate on the strategy:

  • use the INFO MEMORY command to find out whether usage is above the limit,
  • select some keys with RANDOMKEY and pick the conj:* ones among them,
  • for each conjunction key, fetch its members and delete those keys along with the conjunction key itself:
keys = redis_client.smembers(conj_key)
redis_client.delete(conj_key, *keys)

(the last step is better run in Lua for atomicity - see the sketch below)
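
Putting that together, a rough sketch of strategy 1 in Python (the memory limit, the sample count, and the helper names are mine, not cacheops API):

import redis

redis_client = redis.Redis()
MEMORY_LIMIT = 4 * 2**30  # example soft limit, in bytes

# Fetch a conj key's members and delete them together with the conj key
# itself; running this as a Lua script keeps the whole step atomic
delete_conj = redis_client.register_script("""
    local keys = redis.call('smembers', KEYS[1])
    table.insert(keys, KEYS[1])
    return redis.call('del', unpack(keys))
""")

def evict_some(samples=1000):
    # INFO MEMORY reports the current usage
    if redis_client.info('memory')['used_memory'] <= MEMORY_LIMIT:
        return
    for _ in range(samples):
        key = redis_client.randomkey()
        if key and key.startswith(b'conj:'):
            delete_conj(keys=[key])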

The alternative, let's call it strategy 1a, is probably better:

  • use CACHEOPS_LRU = True and maxmemory-policy volatile-lru (the second strategy from the README)
  • periodically SCAN for conjunction keys and remove them if they are orphaned:
for conj_key in redis_client.scan_iter(match='conj:*'):
    keys = redis_client.smembers(conj_key)
    # EXISTS with several keys returns how many of them exist
    exists = redis_client.execute_command('EXISTS', *keys)
    if exists == 0:
        redis_client.delete(conj_key)

(the innards of the loop should be done with Lua for atomicity - sketched below)

Edit: changed maxmemory-policy volatile-ttl to volatile-lru, which is the one used by the second README strategy.
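
Here is a sketch of how the loop above could push the check-and-delete into Lua (the script name is mine; for very large conj sets the EXISTS call would need chunking):

import redis

redis_client = redis.Redis()

# Delete a conj key only if none of its member cache keys still exist;
# doing the check and the delete server-side makes them atomic
drop_orphan = redis_client.register_script("""
    local keys = redis.call('smembers', KEYS[1])
    if #keys == 0 or redis.call('exists', unpack(keys)) == 0 then
        return redis.call('del', KEYS[1])
    end
    return 0
""")

for conj_key in redis_client.scan_iter(match='conj:*'):
    drop_orphan(keys=[conj_key])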

Hi all. I'm trying to understand what it means to use the eviction policy recommended in the README, which is CACHEOPS_LRU = True and maxmemory-policy volatile-lru. If I run my cache like this, do I lose the ability to expire cached views based on time? Is the only way to remove an item from the cache to let it get 'pushed out' by newer items?

What I want is to have my view cache expire after 24 hours like normal, BUT if I hit the max memory limit, the oldest items are pushed out to make room for the new ones.

No, you don't lose the ability to expire by time. The only downside is that invalidation structures can clutter your redis db over time; cache keys are still evicted by timeout.
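
For reference, that README strategy amounts to roughly this (a sketch; the 24-hour timeout and the memory values are just examples):

# settings.py
CACHEOPS_REDIS = "redis://localhost:6379/1"
CACHEOPS_LRU = True  # conj keys get no TTL, so volatile-lru only touches cache keys
CACHEOPS = {
    '*.*': {'ops': 'all', 'timeout': 60 * 60 * 24},  # cache keys still expire after 24h
}

# redis.conf
# maxmemory 512mb
# maxmemory-policy volatile-lru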

Oh ok. So if I understand this right, two keys are created for each item that is cached: one is the actual content and the other is the invalidation instructions. With the method I mentioned, the content keys will be removed, but the invalidation keys will remain? And if I run the conj_key function as a management command every so often, those invalidation keys will be cleared out?

Several conj_keys may refer to a single cache key; here is the description of how it works. When you use CACHEOPS_LRU = True, conj_keys are not evicted by time, so they may clutter up, referencing non-existing cache keys. They are still removed on the corresponding invalidation events, so this might not be an issue.
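
Schematically, the structure looks like this (a hypothetical example - the key names are illustrative, only the prefixes follow cacheops conventions):

conj:myapp_post:author_id=1     # a Redis SET acting as an invalidator
    q:6b29d3...                 # cache keys holding cached querysets
    q:f41a08...

Invalidating that conjunction means deleting the SET's members and then the SET itself; a single cache key may appear in several such sets.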

There is no such thing as a conj_key function. You basically need to go through the conj_keys, check whether they refer only to non-existing cache keys, and remove them if so - I wrote the draft above. It could be improved though: remove non-existing cache keys from the conj key one by one, instead of checking all of them and only ever removing the whole set:

for conj_key in r.scan_iter(match='conj:*'):
    for cache_key in r.smembers(conj_key):
        # These two lines should be done atomically
        if not r.exists(cache_key):
            r.srem(conj_key, cache_key) 

Redis automatically removes keys for empty sets, so that's it.
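
The atomic part could again be a small Lua script, along these lines (a sketch, names mine):

import redis

r = redis.Redis()

# SREM a cache key from the conj set only if that cache key no longer
# exists, so a concurrent cache write cannot be lost in between
prune_member = r.register_script("""
    if redis.call('exists', KEYS[2]) == 0 then
        return redis.call('srem', KEYS[1], KEYS[2])
    end
    return 0
""")

for conj_key in r.scan_iter(match='conj:*'):
    for cache_key in r.smembers(conj_key):
        prune_member(keys=[conj_key, cache_key])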

Thank you for the quick reply. I’m trying out this strategy and will report back.

I know it would take a large amount of effort to do right, but I think it would be beneficial if we could configure multiple cache backends. That would also make this memory limit issue easily solvable by running multiple redis instances (which many people do already, since redis is single-threaded).

This has nothing to do with other backends. BTW cacheops doesn't use other backends because it uses sets and set operations in redis, which other backends just don't provide.

You misunderstood. I'm talking about multiple redis servers so you can have memory limits through redis.

Multiple redises have nothing to do with the memory limit.

I don't see why not? You could have multiple redis servers and specify maxmemory for each server separately.

For example, assuming you have your sessions in redis, you want to be absolutely certain they will never reach an out-of-memory scenario. Whereas many cache layers don't have any real priority, so you can set that server to allkeys-lru and omit the need for SETEX or EXPIRE.

None of this matters from the cacheops implementation point of view; multiple-server support and memory limits are completely independent issues. There is no reason to talk about multiple servers or backends here.

Using CACHEOPS_INSIDEOUT = True is the blessed way to solve this now, see Using memory limit in the README.
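
A minimal sketch of that setup (the memory values are examples; the policy choice is an assumption - check the README's "Using memory limit" section for the recommended settings):

# settings.py
CACHEOPS_INSIDEOUT = True

# redis.conf
# maxmemory 512mb
# maxmemory-policy volatile-lru  # assumption - see the README for the advised policy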