grantjenks / python-diskcache

Python disk-backed cache (Django-compatible). Faster than Redis and Memcached. Pure-Python.

Home Page: http://www.grantjenks.com/docs/diskcache/

support pandas.DataFrame in cache.memoize()

LudwigAJ opened this issue

diskcache is one of the few caching choices available for Python's Dash library, and pandas.DataFrames are heavily used in general. Could the .memoize() functions introduce support for them?

As of now, I believe the function simply tries to hash the DataFrame object rather than its contents, which doesn't guarantee the same hash for the same frame (data-wise).

pandas has the following function: pandas.util.hash_pandas_object which could be used to hash the contents.
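To illustrate the idea, here is a minimal sketch of content-based hashing with pandas.util.hash_pandas_object. The helper name frame_digest and the choice of SHA-256 are assumptions for illustration only, not part of diskcache or pandas. Since hash_pandas_object hashes row values (and optionally the index) but not the column labels, the sketch mixes the column labels in separately.

```python
import hashlib

import pandas as pd


def frame_digest(df: pd.DataFrame) -> str:
    """Hypothetical helper: hex digest derived from a frame's contents."""
    # One uint64 hash per row, covering the row's values and its index label.
    row_hashes = pd.util.hash_pandas_object(df, index=True)
    h = hashlib.sha256()
    # Column labels are not part of the row hashes, so add them explicitly.
    h.update(repr(list(df.columns)).encode("utf-8"))
    h.update(row_hashes.values.tobytes())
    return h.hexdigest()


a = pd.DataFrame({"x": [1, 2, 3], "y": ["a", "b", "c"]})
b = a.copy()                 # distinct object, identical data
c = a.assign(x=[1, 2, 4])    # one value changed

same = frame_digest(a) == frame_digest(b)
different = frame_digest(a) != frame_digest(c)
```

Two separately constructed frames holding the same data yield the same digest, which is the property a memoization key needs.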

The user could then specify which input parameters/arguments of decorated functions are DataFrames via an additional frames parameter.

It could work similarly to the ignore parameter, i.e. something like the following:

@cache.memoize(frames={0, 'myDF'})
def someFunc(myDF, someDate, someString):
    # do some operation(s)
    return someResult
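Until something like this exists, the proposal could be approximated outside the library. The following is only a rough sketch: memoize_frames, _digest, and the frames argument are hypothetical names, not diskcache API. It takes any mapping-like store (a plain dict here; a diskcache.Cache should also work, since it supports `in`, `cache[key]`, and `cache[key] = value`) and replaces the marked DataFrame arguments with a content digest when building the key. It assumes all other arguments are hashable and does not handle *args or **kwargs parameters.

```python
import functools
import hashlib
import inspect

import pandas as pd


def _digest(df):
    """Hypothetical helper: content digest of a DataFrame."""
    rows = pd.util.hash_pandas_object(df, index=True)
    h = hashlib.sha256(repr(list(df.columns)).encode("utf-8"))
    h.update(rows.values.tobytes())
    return h.hexdigest()


def memoize_frames(cache, frames=()):
    """Memoize func in `cache`, keying the arguments named in `frames`
    (positional indexes or parameter names) by DataFrame contents."""
    def decorator(func):
        sig = inspect.signature(func)
        names = list(sig.parameters)
        # Normalize indexes such as 0 to parameter names such as 'myDF'.
        frame_names = {names[f] if isinstance(f, int) else f for f in frames}

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            parts = [func.__qualname__]
            for name, value in bound.arguments.items():
                if name in frame_names:
                    value = _digest(value)  # hash contents, not the object
                parts.append((name, value))
            key = tuple(parts)
            if key in cache:
                return cache[key]
            result = func(*args, **kwargs)
            cache[key] = result
            return result
        return wrapper
    return decorator


store = {}   # stand-in for a diskcache.Cache instance
calls = []   # records how often the function body actually runs


@memoize_frames(store, frames={0, 'myDF'})
def someFunc(myDF, someDate, someString):
    calls.append(1)
    return int(myDF["x"].sum())


df = pd.DataFrame({"x": [1, 2, 3]})
first = someFunc(df, "2020-01-01", "a")
second = someFunc(df.copy(), "2020-01-01", "a")  # same data, served from cache
```

Because the key is built from the frame's digest, the second call with a copy of the frame is a cache hit even though it is a different object.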