paylogic / pip-accel

pip-accel: Accelerator for pip, the Python package manager

Home Page: https://pypi.python.org/pypi/pip-accel

Enable binary cache format revisions to coexist

xolox opened this issue

Problem:

In pull request #33, a new cache backend was introduced that stores binary distribution archives on Amazon S3. However, the CACHE_FORMAT_REVISION value, which is used to clear the binary cache when the format changes, is only stored in the ~/.pip-accel directory on the local file system.

Solution:

Similar to ~/.pip-accel/version.txt, the cache format revision should be stored inside the Amazon S3 bucket so that the bucket contents can be cleared when the cache format revision changes.

Problem with proposed solution:

Atomically upgrading a cluster of nodes is unrealistic, which means that multiple pip-accel processes using different CACHE_FORMAT_REVISION values will at some point be running simultaneously. They could end up constantly trashing each other's remote cache and, worse, they may break each other.
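To make the failure mode concrete, here is a minimal sketch of a cache guarded by a single shared revision marker. It is an illustration only, not pip-accel's actual code: the `ensure_revision` and `demo` functions, the `pkg.tar.gz` file name and the revision numbers 6 and 7 are all assumptions made for the example.

```python
import os
import shutil
import tempfile


def ensure_revision(cache_dir, revision):
    """Clear cache_dir if its marker differs from revision (hypothetical helper).

    Returns True when the cache was cleared.
    """
    marker = os.path.join(cache_dir, "version.txt")
    current = None
    if os.path.isfile(marker):
        with open(marker) as handle:
            current = handle.read().strip()
    if current == str(revision):
        return False
    # Revision mismatch: throw away everything and start over.
    shutil.rmtree(cache_dir, ignore_errors=True)
    os.makedirs(cache_dir)
    with open(marker, "w") as handle:
        handle.write(str(revision))
    return True


def demo():
    """Alternate between two revisions sharing one cache and count the clears."""
    cache_dir = tempfile.mkdtemp()
    clears = 0
    for revision in [6, 7, 6, 7]:
        if ensure_revision(cache_dir, revision):
            clears += 1
        # Each process stores an archive that the other one will delete.
        with open(os.path.join(cache_dir, "pkg.tar.gz"), "w") as handle:
            handle.write("archive built by revision %d" % revision)
    shutil.rmtree(cache_dir)
    return clears
```

In this sketch every run with the "other" revision wipes the cache, so neither process ever benefits from archives it stored earlier.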

I've actually seen this trashing of the binary cache happen locally because I have dozens of virtual environments and I don't (sometimes can't) upgrade all of them at the same time.

Proposed architectural change:

Encode the concept of cache format revisions in the top level structure (directories) of the binary cache so that multiple cache format revisions can peacefully co-exist. This could be implemented (and would be useful) both for the local binary cache and for remote binary caches.
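A rough sketch of what encoding the revision in the top-level structure could look like, for both a local directory and a remote object key. The helper names (`revision_prefix`, `local_cache_path`, `remote_cache_key`), the `vN` naming scheme and the default revision number are assumptions for illustration, not pip-accel's API.

```python
import os

# Assumed revision number, used here only as an example default.
CACHE_FORMAT_REVISION = 7


def revision_prefix(revision=CACHE_FORMAT_REVISION):
    """Return the top-level directory component for a cache format revision."""
    return "v%d" % revision


def local_cache_path(cache_root, filename, revision=CACHE_FORMAT_REVISION):
    """Build the local path of a cached archive, nested under its revision."""
    return os.path.join(cache_root, revision_prefix(revision), filename)


def remote_cache_key(filename, revision=CACHE_FORMAT_REVISION):
    """Build a remote object key; object stores use '/' regardless of the OS."""
    return "%s/%s" % (revision_prefix(revision), filename)
```

Because each revision only reads and writes below its own prefix, processes running different revisions never touch each other's archives, and no marker file is needed to decide whether the cache must be cleared.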

I've thought about implementing this feature before and I guess now is the right time - before the Amazon S3 backend becomes popular :-).

One reason I was hesitant to allow multiple co-existing binary cache revisions is that there is no longer any mechanism to clear them. I honestly don't know if it's pip-accel's place to define a policy for that. But if it doesn't, everyone will "accept the defaults" and have an ever-growing number of ever-growing binary cache directories :-) (just like now but worse, because old cache format revisions persist as well).
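If a cleanup policy were wanted, one possible shape is "keep only the newest few revisions". The sketch below is purely hypothetical: the `prune_old_revisions` function, its `keep` parameter and the assumption that revision directories are named `v` followed by a number are illustrative choices, not something pip-accel provides.

```python
import os
import re
import shutil


def prune_old_revisions(cache_root, keep=2):
    """Remove all but the `keep` newest vN/ directories under cache_root.

    Returns the sorted list of revision numbers that were removed.
    """
    revisions = []
    for name in os.listdir(cache_root):
        match = re.match(r"^v(\d+)$", name)
        # Only directories that look like revision directories are considered.
        if match and os.path.isdir(os.path.join(cache_root, name)):
            revisions.append(int(match.group(1)))
    revisions.sort()
    doomed = revisions[:-keep] if keep > 0 else revisions
    for revision in doomed:
        shutil.rmtree(os.path.join(cache_root, "v%d" % revision))
    return doomed
```

Such a policy would have to be opt-in or conservative, since an old revision directory may still be in use by a virtual environment that has not been upgraded.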

I believe I just solved this problem, so I'll go ahead and close this issue now. For anyone curious about the details: I added a level of nesting to the binary cache format, so that a v7/ directory component is prefixed to binary distribution archives generated by the version of pip-accel I just published. The next cache format revision will use the v8/ directory component prefix, so that multiple versions of pip-accel can (from now on) coexist without trashing each other's binary cache. I also tested this with the Amazon S3 backend introduced in pull request #33, and it works as expected there as well.