Siegfried is a signature-based file format identification tool.
Key features are:
- complete implementation of PRONOM (byte and container signatures)
- fast matching without limiting the number of bytes scanned
- detailed information about the basis for format matches
- simple command line interface with a choice of outputs
- a built-in server for integrating with workflows and language inter-op
- power options including debug mode, signature modification, and multiple identifiers
Version 1.4.5
sf file.ext
sf DIR
sf -csv file.ext | DIR // Output CSV rather than YAML
sf -json file.ext | DIR // Output JSON rather than YAML
sf -droid file.ext | DIR // Output DROID CSV rather than YAML
sf - // Read list of files piped to stdin
sf -nr DIR // Don't scan subdirectories
sf -z file.zip | DIR // Decompress and scan zip, tar, gzip, warc, arc
sf -hash md5 file.ext | DIR // Calculate md5, sha1, sha256, sha512, or crc hash
sf -sig custom.sig file.ext // Use a custom signature file
sf -home c:\junk -sig custom.sig file.ext // Use a custom home directory
sf -serve hostname:port // Server mode
sf -version // Display version information
sf -throttle 10ms DIR // Pause for duration (e.g. 1s) between file scans
sf -log [comma-sep opts] file.ext | DIR // Log errors etc. to stderr (default) or stdout
sf -log e,w file.ext | DIR // Log errors and warnings to stderr
sf -log u,o file.ext | DIR // Log unknowns to stdout
sf -log d,s file.ext | DIR // Log debugging and slow messages to stderr
sf -log p,t DIR > results.yaml // Log progress and time while redirecting results
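For workflow integration, the JSON output can be consumed from another program. The following is a minimal Go sketch, not part of siegfried itself: it runs `sf -json` on a file or directory and prints the number of matches per file. The type names `scanResult` and `scanFile`, the helper names `sfArgs` and `parseResult`, and the JSON keys `files`, `filename` and `matches` are assumptions about the output layout; check them against what your version of `sf -json` actually emits. Each match is kept as a generic map so the sketch does not depend on the exact match fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
)

// scanFile is one entry of the assumed "files" array in sf's JSON output.
// Matches are decoded into generic maps so no match field names are assumed.
type scanFile struct {
	Filename string                   `json:"filename"`
	Matches  []map[string]interface{} `json:"matches"`
}

// scanResult holds only the part of the output this sketch reads.
type scanResult struct {
	Files []scanFile `json:"files"`
}

// sfArgs builds the argument list for "sf -json file.ext | DIR".
func sfArgs(target string) []string {
	return []string{"-json", target}
}

// parseResult decodes JSON bytes into a scanResult.
func parseResult(data []byte) (scanResult, error) {
	var res scanResult
	err := json.Unmarshal(data, &res)
	return res, err
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: sfjson file.ext | DIR")
		return
	}
	// Requires sf to be installed and on the system path.
	out, err := exec.Command("sf", sfArgs(os.Args[1])...).Output()
	if err != nil {
		fmt.Fprintln(os.Stderr, "running sf:", err)
		os.Exit(1)
	}
	res, err := parseResult(out)
	if err != nil {
		fmt.Fprintln(os.Stderr, "decoding sf output:", err)
		os.Exit(1)
	}
	for _, f := range res.Files {
		fmt.Printf("%s: %d match(es)\n", f.Filename, len(f.Matches))
	}
}
```

Decoding only the keys the program needs keeps the sketch tolerant of extra fields in the output. Errors and warnings logged by sf go to stderr by default, so they do not interfere with the JSON read from stdout.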
By default, siegfried uses the latest PRONOM and container signatures with no buffer limits. You can customise your signature file by using the roy tool.
With Go installed:
go get github.com/richardlehane/siegfried/cmd/sf
sf -update
Download a pre-built binary from the releases page. Unzip to a location in your system path. Then run:
sf -update
Mac Homebrew (or Linuxbrew):
brew install mistydemeo/digipres/siegfried
Ubuntu/Debian:
wget -qO - https://bintray.com/user/downloadSubjectPublicKey?username=bintray | sudo apt-key add -
echo "deb http://dl.bintray.com/siegfried/debian wheezy main" | sudo tee -a /etc/apt/sources.list
sudo apt-get update && sudo apt-get install siegfried
- bugfix: big file handling
- bugfix: file handle leak; reported by Ross Spencer
- bugfix: mscfb; reported by Ross Spencer
- summarise os errors; requested by Ross Spencer
- code quality: vendor external packages; implemented by Misty de Meo
- fix: speed regression introduced by the TIFF mis-identification patch in the last release
- code quality: refactor textmatcher package
- code quality: refactor siegreader package
- code quality: documentation
- measure time elapsed with -log time
- bugfix: percent encode file URIs in droid output
- bugfix: long Windows directory paths (further work on bug fixed in 1.4.2); reported by Ross Spencer
- bugfix: mscfb panic; reported by Ross Spencer
- bugfix: TIFF mis-identifications due to an early halt error
- new -throttle flag; requested by Ross Spencer
- errors logged to stderr by default (to quieten use -log ""); requested by Ross Spencer
- mscfb update: lazy reading
- webarchive update: decode Transfer-Encoding and Content-Encoding; requested by Dragan Espenschied
- bugfix: long Windows paths; reported by Ross Spencer
- bugfix: 32-bit file size overflow; reported by Ross Spencer
- -log replaces -debug, -slow, -unknown and -known flags (see usage above)
- highlight empty file/stream with error and warning
- negative text match overrides extension-only plain text match
- new MIME matcher; requested by Dragan Espenschied
- support warc continuations
- add all.json and tiff.json sets
- minor speed-up
- report less redundant basis information
- report error on empty file/stream
Copyright 2016 Richard Lehane
Licensed under the Apache License, Version 2.0
Like siegfried and want to get involved in its development? That'd be wonderful! There are some notes on the wiki to get you started, and please get in touch.
Thanks TNA for http://www.nationalarchives.gov.uk/pronom/ and http://www.nationalarchives.gov.uk/information-management/projects-and-work/droid.htm
Thanks Ross for https://github.com/exponential-decay/skeleton-test-suite-generator and http://exponentialdecay.co.uk/sd/index.htm; both are very handy!
Thanks Misty for the brew and Ubuntu packaging