grantjenks / python-diskcache

Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python.

Home Page: http://www.grantjenks.com/docs/diskcache/

Use case? Persistent concurrent set.

kmatt opened this issue

I need to keep track of a list of files processed by a multiprocess application. A Redis SET would work, but I would prefer not to manage a separate daemon process. Redislite has been somewhat problematic, leaving orphaned processes on app termination.

This module may be a solution when using a Cache with no expirations or evictions? I need the set to be kept permanently.

One thing not clear is if an Index would be better suited. A dictionary with an unused value (1) could emulate a set. My current Cache proof of concept uses incr() to count how often each file is queued, since a file queued more than once may indicate a logic error in the processing.
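A minimal sketch of the proof of concept described above, assuming the third-party diskcache package is installed. The helper names (open_tracker, record_queued, queued_more_than_once) are hypothetical and only for illustration; eviction_policy="none" and incr() are diskcache's own. No expiry is passed anywhere, so entries stay until they are deleted explicitly.

```python
def open_tracker(directory):
    """Open a diskcache.Cache that never evicts entries.

    `directory` is any writable path; every worker process opens the same one.
    """
    import diskcache  # third-party: pip install diskcache

    return diskcache.Cache(directory, eviction_policy="none")


def record_queued(tracker, path):
    """Atomically bump the queue count for `path` and return the new count.

    A missing key starts from incr()'s default of 0, so the first call returns 1.
    """
    return tracker.incr(path)


def queued_more_than_once(tracker):
    """Return the paths whose count exceeds 1 (a possible logic error)."""
    return sorted(key for key in tracker if tracker[key] > 1)
```

Each worker would call record_queued(tracker, path) when it queues a file; a return value above 1 flags a duplicate right away, and queued_more_than_once(tracker) gives a summary afterwards.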

> This module may be a solution when using a Cache with no expirations or evictions? I need the set to be kept permanently.

Sure, that's reasonable.

> One thing not clear is if an Index would be better suited.

Seems better.

> A dictionary with an unused key (1) could emulate a set.

I think you mean an unused value.

See also: https://grantjenks.com/docs/diskcache/case-study-web-crawler.html

> I think you mean an unused value.

Correct, updated question. Web crawler case study is instructive, thanks!

Is there documentation that describes when an Index() is not a good option? Or, as an extension of Cache(), is it suitable in all equivalent cases?

https://grantjenks.com/docs/diskcache/tutorial.html#index

Index() simply uses Cache() under the hood and follows Python's Mapping API.
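Because Index() follows the Mapping API, the set emulation discussed earlier in the thread can be written against any dict-like object. The sketch below assumes the diskcache package is installed; the helper names (open_seen, mark_processed, is_processed, processed_files) are hypothetical and not part of diskcache.

```python
def open_seen(directory):
    """Open a persistent diskcache.Index stored under `directory`."""
    import diskcache  # third-party: pip install diskcache

    return diskcache.Index(directory)


def mark_processed(seen, path):
    """Add `path` to the emulated set; the stored value 1 is never read."""
    seen[path] = 1


def is_processed(seen, path):
    """Membership test, the same as for a dict or a set."""
    return path in seen


def processed_files(seen):
    """Return all recorded paths as a list."""
    return list(seen)
```

Since the helpers rely only on mapping operations, a plain dict can stand in for the Index in unit tests, while worker processes that open the same directory share one persistent set.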