exponential-decay / digital-preservation-stage-boss-one

Checksum calculation and format identification performance results for improving dynamic management of digital repositories.

digital-preservation-stage-boss-one

Checksum calculation and format identification performance results for understanding impact when managing digital repositories.

Intro

Digital preservation 101 says we need two things: checksums for our digital content, and an idea of the range of file formats that the content is encoded in.

If we are to ensure the longevity of our cultural heritage, then this is stage boss number one that we need to beat.

To do that effectively... we should probably understand how the components are going to perform.

Here we present scripts to produce timings for:

  • MD5 calculation
  • SHA1 calculation
  • DROID identification using various performance settings
  • Siegfried identification using various performance settings

Methodology and results are presented in more detail here: http://openpreservation.org/blog/2016/08/22/digital-preservation-stage-boss-one-the-performance-of-file-format-identification-tools-vs-checksum-generation-tools/
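
To give a feel for what the checksum timings involve, here is a minimal sketch in Python. It is not the script from this repository: the function names `time_checksum` and `compare_checksums`, and the 1 MiB chunk size, are assumptions made for illustration. It hashes a file with MD5 and SHA1 using the standard `hashlib` module and records the wall-clock time each takes.

```python
import hashlib
import time


def time_checksum(path, algorithm, chunk_size=1024 * 1024):
    """Return (hex_digest, elapsed_seconds) for hashing one file.

    `chunk_size` is an illustrative default, not a value from the repo.
    """
    digest = hashlib.new(algorithm)
    start = time.perf_counter()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    elapsed = time.perf_counter() - start
    return digest.hexdigest(), elapsed


def compare_checksums(path, algorithms=("md5", "sha1")):
    """Time each algorithm against the same file; returns {name: (digest, seconds)}."""
    return {name: time_checksum(path, name) for name in algorithms}
```

Calling `compare_checksums("some-file.bin")` returns one digest and one timing per algorithm. Repeated runs over the same file are likely to be faster because of operating system file caching, so single measurements should be treated with caution.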
