pkolaczk / fclones

Efficient Duplicate File Finder

How does a change in --hash-fn impact the --cache? [documentation]

patrickwolf opened this issue · comments

It's not clear from the documentation how an existing cache is impacted by a change in --hash-fn.

  • i.e., will files that are already cached be ignored (not desired)?
  • Will they be re-evaluated?
  • What happens if I switch back to the old hash: will it need to be redone, or does the db store both?
  • Will the new hash be used for the start/end hashing of a file, or only for content hashing? (Using it only for content hashing seems more efficient, since the pre-grouping can happen via the cache.)

Thanks for answering here and/or updating the documentation!

related ticket: #153

From testing, it seems that:

  • changing the hash function only affects content hashing; all other steps are still cached
  • the cache file keeps the results of multiple hash algorithms, i.e. one can be fully cached while the other still needs to be processed

Yup, the hash algorithm is part of the key of the cache. So it maintains a separate hash cache for each algorithm.
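
To illustrate what "the hash algorithm is part of the key" means in practice, here is a minimal Python sketch. It is not fclones's actual code or on-disk cache format; the `HashCache` class, the `get_or_compute` method, and the `"file-1"` identifier are hypothetical names made up for this example, and Python's `hashlib` stands in for the real hashers.

```python
import hashlib


class HashCache:
    """Toy cache keyed by (file id, hash algorithm). Hypothetical, not fclones's real structure."""

    def __init__(self):
        self._entries = {}
        self.computations = 0  # counts how often a hash had to be computed

    def get_or_compute(self, file_id, hash_fn, data):
        # The algorithm name is part of the key, so entries produced by
        # different algorithms never overwrite or invalidate each other.
        key = (file_id, hash_fn)
        if key not in self._entries:
            self._entries[key] = hashlib.new(hash_fn, data).hexdigest()
            self.computations += 1
        return self._entries[key]


cache = HashCache()
data = b"example file contents"

cache.get_or_compute("file-1", "sha256", data)  # cache miss: computed
cache.get_or_compute("file-1", "sha512", data)  # different algorithm: computed separately
cache.get_or_compute("file-1", "sha256", data)  # switching back: served from the cache

print(cache.computations)  # 2
```

Under this model, switching the algorithm leaves the old entries in place, so switching back does not require rehashing files that were already hashed with the old algorithm.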