psf / cachecontrol

The httplib2 caching algorithms packaged up for use with requests.

Do not delete stale cache entries if they have ETag or Last-Modified

hexagonrecursion opened this issue

The whole point of ETag and Last-Modified is to allow the client to revalidate a stale cache entry. Using self.cache.set(..., expires=expires_time) means that the cache entry is deleted as soon as it becomes stale, which defeats the purpose of ETag and Last-Modified.
https://github.com/ionrock/cachecontrol/blob/d5236827219369126d15a2f3f49432f47eac1cb1/cachecontrol/controller.py#L320-L335
I'm not sure which branch is responsible for saving responses with a Last-Modified header. Is it this one?
https://github.com/ionrock/cachecontrol/blob/d5236827219369126d15a2f3f49432f47eac1cb1/cachecontrol/controller.py#L346-L375
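
To make the revalidation point concrete, here is a minimal sketch of what a client can do with a stale entry that is still in the cache. The helper name conditional_headers is hypothetical and is not part of cachecontrol; it only shows how the standard If-None-Match and If-Modified-Since request headers are derived from the stored validators.

```python
def conditional_headers(cached_headers):
    """Build conditional-request headers from a cached response's headers.

    Hypothetical helper for illustration; not cachecontrol's API.
    """
    # Header names are case-insensitive, so normalize before looking them up.
    lowered = {name.lower(): value for name, value in cached_headers.items()}
    headers = {}
    if "etag" in lowered:
        headers["If-None-Match"] = lowered["etag"]
    if "last-modified" in lowered:
        headers["If-Modified-Since"] = lowered["last-modified"]
    return headers


stale_entry = {
    "ETag": '"abc123"',
    "Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT",
}
print(conditional_headers(stale_entry))
```

If the server answers 304 Not Modified, the stored body can be reused. None of this is possible once the backend has already deleted the entry.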

This currently only affects RedisCache, as both DictCache and FileCache ignore the expires parameter.

https://github.com/ionrock/cachecontrol/blob/d5236827219369126d15a2f3f49432f47eac1cb1/cachecontrol/controller.py#L320-L335

Looking closer at this code, I think it tries to implement the following (if the cache honors the expires parameter):

  • For responses with a freshness lifetime of less than 14 days:
    • Delete them when their age reaches 14 days
    • Attempt to revalidate if they are requested after they become stale but before their age reaches 14 days
  • For responses with a freshness lifetime greater than or equal to 14 days:
    • Delete them when they become stale
    • Never attempt to revalidate
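
The behavior in the list above can be restated as a small sketch. This is my reading of the code, not a copy of it: the function name described_policy is made up, and it assumes the response's age is zero at the moment it is stored.

```python
FOURTEEN_DAYS = 14 * 24 * 60 * 60  # seconds


def described_policy(freshness_lifetime):
    """Return (delete_after, revalidation_window) in seconds.

    Illustrative restatement of the list above; not cachecontrol's code.
    """
    # The entry is kept until it is 14 days old or until it goes stale,
    # whichever comes later.
    delete_after = max(freshness_lifetime, FOURTEEN_DAYS)
    # Revalidation is only possible while the entry is stale but not yet deleted.
    revalidation_window = delete_after - freshness_lifetime
    return delete_after, revalidation_window


print(described_policy(60 * 60))            # one hour of freshness
print(described_policy(30 * 24 * 60 * 60))  # thirty days of freshness
```

For long-lived responses the revalidation window collapses to zero, which is the "never attempt to revalidate" case.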

On a related note, this freshness lifetime calculation is wrong because it ignores Cache-Control: max-age.
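
For comparison, here is a rough sketch of a freshness lifetime calculation that does look at max-age. Under the HTTP caching rules, max-age takes precedence over Expires when both are present. The function name freshness_lifetime is hypothetical, and the sketch deliberately leaves out s-maxage and heuristic freshness.

```python
import re
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return the freshness lifetime in seconds, or None if it is not stated.

    Simplified sketch: ignores s-maxage and heuristic freshness.
    """
    lowered = {name.lower(): value for name, value in headers.items()}

    # Cache-Control: max-age wins over Expires when both are present.
    match = re.search(
        r'(?:^|[\s,])max-age\s*=\s*"?(\d+)',
        lowered.get("cache-control", ""),
        re.IGNORECASE,
    )
    if match:
        return int(match.group(1))

    # Fall back to Expires minus Date.
    if "expires" in lowered and "date" in lowered:
        try:
            expires = parsedate_to_datetime(lowered["expires"])
            date = parsedate_to_datetime(lowered["date"])
            return max(0, int((expires - date).total_seconds()))
        except (TypeError, ValueError):
            # An unparseable date (for example "Expires: 0") means already expired.
            return 0

    return None
```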

This looks magical. No rationale is provided.

  • Why delete at all?
    • No matter how old the response is, there is always a chance that a revalidation will succeed. It only makes sense to delete it if we are running low on space.
    • This completely ignores how much free space is left in the cache, how often the key is accessed, and how recently the key was accessed. It will trigger either too early (when we have plenty of space) or too late.
  • Why this specific formula?
  • Why 14 days?