dgilland / hashfs

A content-addressable file management system for Python.

Home Page:http://hashfs.readthedocs.org/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Document how versioning affects the stored content

torfsen opened this issue · comments

The docs mention how the API might change between version changes, but I couldn't find anything about the data stored on disk. In particular, I'd like to know if there is a guarantee that I will be able to re-use a HashFS store on disk created with one version of hashfs with a different version of hashfs later on.

I have no plans to change the way the files are stored on disk. But if that ever did happen, it would be done in a major version change with the appropriate changelog entry. And even in that case, HashFS.repair() could be used to reindex all files under a directory and its subdirectory whenever the addressing scheme needs to change (e.g. today, if you started with md5 hashing but wanted to change to sha256 or if you wanted to change the depth or width of the folder sharding).

That sounds perfect, especially with HashFS.repair providing another layer of backwards-compatibility. It would be awesome if that information was included in the versioning documentation. I'd be happy to provide a PR but it might take a while until I find time to do it.