grantjenks / python-diskcache

Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python.

Home Page: http://www.grantjenks.com/docs/diskcache/

in Cache._cull(), cull_limit appears to be prescriptive vs. a limit & in Cache.cull(), 10 files are deleted at a time

rphern opened this issue

This may be a case of me misunderstanding the intended use.

Cache._cull()
Expected behavior: files are deleted one at a time until cache.size < cache.size_limit, or until cull_limit files have been deleted
Actual behavior: cull_limit files are deleted
Code: core.py, line 916

rows = sql(select_filename, (cull_limit,)).fetchall()
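To make the reported behavior concrete, here is a minimal, self-contained model of that bounded delete (this is not diskcache's actual code; the table schema and the `cull_once` name are invented for illustration). The query selects at most `cull_limit` victim rows and deletes them in a single pass, even if the cache is still over its size limit afterwards:

```python
import sqlite3

def cull_once(con, cull_limit):
    # Select up to cull_limit of the oldest rows -- the LIMIT is a hard
    # cap on work done, not a loop condition tied to the size limit.
    select_filename = (
        'SELECT rowid, filename, size FROM cache'
        ' ORDER BY store_time LIMIT ?'
    )
    rows = con.execute(select_filename, (cull_limit,)).fetchall()
    for rowid, filename, size in rows:
        con.execute('DELETE FROM cache WHERE rowid = ?', (rowid,))
    return len(rows)

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE cache (filename TEXT, size INTEGER, store_time REAL)')
for i in range(25):
    con.execute('INSERT INTO cache VALUES (?, ?, ?)', ('f%d' % i, 1, float(i)))

deleted = cull_once(con, 10)  # exactly 10 rows deleted, 15 remain
```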

Cache.cull()
Expected behavior: Same, but it ignores cull_limit and is manually invoked.
Actual behavior: files are deleted 10 at a time until cache.size < cache.size_limit
Code: core.py, line 2137

rows = sql(select_filename, (10,)).fetchall()
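For comparison, a minimal model of that looping behavior (again, not the library's actual code; the schema and `cull` signature here are invented for illustration). Batches of 10 of the oldest rows are deleted until the tracked size drops below `size_limit`:

```python
import sqlite3

def cull(con, size_limit):
    # Keep deleting the 10 oldest rows per pass until the total
    # tracked size falls below size_limit (or the table is empty).
    count = 0
    while True:
        (size,) = con.execute('SELECT COALESCE(SUM(size), 0) FROM cache').fetchone()
        if size < size_limit:
            break
        rows = con.execute(
            'SELECT rowid FROM cache ORDER BY store_time LIMIT ?', (10,)
        ).fetchall()
        if not rows:
            break
        for (rowid,) in rows:
            con.execute('DELETE FROM cache WHERE rowid = ?', (rowid,))
        count += len(rows)
    return count

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE cache (size INTEGER, store_time REAL)')
for i in range(37):
    con.execute('INSERT INTO cache VALUES (1, ?)', (float(i),))

removed = cull(con, size_limit=15)  # 37 -> 27 -> 17 -> 7: three batches of 10
```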
commented

I should clarify here: in the case of ._cull(), the selected rows identify the database entries that are deleted. In .cull(), the same thing happens, but inside a while loop.

The behavior, as implemented, is correct and intended. I understand it can be surprising. Here are a couple of reasons:

  1. _cull is called in lots of places. It needs to do a bounded amount of work and be relatively quick.
  2. _cull is intended to work in a concurrent environment. If it looped until the cache was under the size limit, starvation could occur, with one thread constantly culling on behalf of the others.

By culling at least two items on every interaction, the cache will be eventually consistent. The size limits are "softer" in DiskCache than in other libraries.
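A toy demonstration of why a bounded cull per write still converges (this is a sketch, not diskcache: the `write` helper, `size_limit`, and `cull_limit` here are hypothetical, and a dict stands in for the cache). Each write adds one item but may remove up to two, so once the store is over the limit it shrinks faster than it grows:

```python
def write(store, key, cull_limit=2, size_limit=10):
    # Insert, then do a bounded amount of culling: evict at most
    # cull_limit of the oldest keys when over size_limit.
    store[key] = True
    if len(store) > size_limit:
        for victim in list(store)[:cull_limit]:
            del store[victim]

store = {}
for i in range(100):
    write(store, i)
# The size hovers around size_limit instead of growing without bound.
```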

cull is a way to "manually" enforce the size limit.