J535D165 / pyalex

A Python library for OpenAlex (openalex.org)

Add caching

patrickmineault opened this issue

It would be nice to add local caching to pyalex, e.g. using an LRU cache, so that repeated requests don't need to hit the remote endpoint.

Interesting. Not sure whether we should implement this in the library or provide an example of how to do this. I like this tutorial: https://realpython.com/lru-cache-python/. Maybe start with a good use case? Do you have one?
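
For reference, a minimal sketch of the "provide an example" route, using `functools.lru_cache` from the standard library. The helper `make_cached_fetcher` and the stand-in `fake_fetch` are hypothetical names made up for this illustration, not part of pyalex; with pyalex installed, one could presumably pass something like `lambda work_id: Works()[work_id]` instead.

```python
from functools import lru_cache


def make_cached_fetcher(fetch, maxsize=128):
    """Wrap a one-argument fetch function in an in-memory LRU cache.

    `fetch` is any callable taking a hashable key (e.g. an OpenAlex ID
    string). This helper is hypothetical, not a pyalex API.
    """
    return lru_cache(maxsize=maxsize)(fetch)


# Stand-in for a remote call so the sketch runs without network access.
calls = []


def fake_fetch(openalex_id):
    calls.append(openalex_id)  # record every "remote" request
    return {"id": openalex_id}


cached = make_cached_fetcher(fake_fetch)

cached("W1")  # miss: calls fake_fetch
cached("W1")  # hit: served from memory
cached("W2")  # miss: calls fake_fetch

print(len(calls))           # number of "remote" requests actually made
print(cached.cache_info())  # hits, misses, maxsize, currsize
```

Note that this cache lives in memory only, so it is lost when the process exits, and it only matches requests whose arguments are identical. The cached object is also shared between callers, so mutating a returned dict changes what later lookups see.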

Is there a reason this has been closed besides the missing use cases?

I'm thinking of a project for which I would like to obtain all works for a number of institutions. Let's say we use a list of ROR IDs for this and make a separate request for each ID. If we look up the same IDs again within a month, we simply read the data from the local cache; for any IDs that have been added to the list in the meantime, we fetch the data from the API. After a month, we consider the cached queries outdated, so when we look up the same IDs again, the data is fetched anew and the local cache is overwritten.

It's not LRU, but use cases like these are covered by the caching implementation of the pybliometrics library. Queries are encoded as MD5 hashes.

Something worth adding to PyAlex?
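
To make the use case above concrete, here is a rough sketch of a disk cache with a one-month expiry, keyed by the MD5 hash of the query string. It is not taken from pybliometrics or pyalex: the function names, the JSON file layout, and the 30-day approximation of "a month" are all assumptions for illustration, and `fetch` stands for whatever function performs the actual API request.

```python
import hashlib
import json
import time
from pathlib import Path

# Assumption: "a month" is approximated as 30 days.
MAX_AGE_SECONDS = 30 * 24 * 60 * 60


def cache_path(cache_dir, query):
    """Map a query string to a file name via its MD5 hex digest."""
    digest = hashlib.md5(query.encode("utf-8")).hexdigest()
    return Path(cache_dir) / f"{digest}.json"


def fetch_with_disk_cache(query, fetch, cache_dir,
                          max_age=MAX_AGE_SECONDS, now=None):
    """Return a fresh cached result for `query`, else call `fetch(query)`.

    `fetch` must return JSON-serialisable data. `now` can be passed to
    simulate a later point in time; it defaults to the current time.
    """
    now = time.time() if now is None else now
    path = cache_path(cache_dir, query)

    # Fresh cache entry: read it and skip the remote request.
    if path.exists() and now - path.stat().st_mtime < max_age:
        return json.loads(path.read_text(encoding="utf-8"))

    # Missing or outdated entry: fetch and overwrite the local copy.
    result = fetch(query)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result), encoding="utf-8")
    return result
```

With this in place, looping over a list of ROR IDs and calling `fetch_with_disk_cache(ror_id, fetch, "cache")` for each would only hit the API for IDs that are new or whose cached file is older than `max_age`. MD5 is used here purely to derive a file name, not for security.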