db47h / dmtx-backup

Plain paper backup/restore solution using libdmtx

dmtx-backup

Backup any file with dmtx-backup, restore it with dmtx-restore.

Requirements:

  • Ghostscript 9.05
  • ImageMagick 6.6.9
  • libdmtx and libdmtx-utils 0.7.2

Note that the scripts were developed using the versions above (available in Ubuntu 12.04). Earlier versions of these tools may work, but have not been tested.

dmtx-backup

dmtx-backup reads the data to be backed up from standard input and writes a printable, possibly multi-page, PDF document to standard output. It can store up to 6220 bytes of data per page, spread across 4 DataMatrix 2D barcodes.
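As a rough planning aid, the page count for a given input size follows from the 6220 bytes per page figure above. The sketch below is not part of dmtx-backup; the function name `pages_needed` is hypothetical, and it assumes every page is filled to its full capacity before a new one is started.

```shell
#!/bin/sh
# Estimate how many printed pages a backup will need.
# Assumption: each page carries exactly 6220 bytes of payload.
BYTES_PER_PAGE=6220

pages_needed() {
    # $1: size of the input in bytes; the division rounds up.
    echo $(( ($1 + BYTES_PER_PAGE - 1) / BYTES_PER_PAGE ))
}

pages_needed 6220    # a full page still fits on one sheet
pages_needed 6221    # one extra byte spills onto a second sheet
```

With a real file, the size can be passed in with something like `pages_needed "$(wc -c <data-to-backup)"`.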

Usage:

$ cat data-to-backup | dmtx-backup BACKUP-ID >output.pdf

The BACKUP-ID is purely informational. It is printed on the generated PDF document to help you identify which backup set a printed sheet belongs to. dmtx-backup also prints the md5sum of the data encoded in each barcode below that barcode, and the md5sum of the data encoded in a whole page at the bottom of that page.
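These printed checksums can be compared against what dmtx-restore (described below) reads back from a page. The following is only a sketch: `page_sums` is a hypothetical helper, and it assumes that each PREFIX-N-B.raw file holds exactly the payload of one barcode and that the page-level sum covers barcodes 1 to 4 concatenated in order.

```shell
#!/bin/sh
# Print MD5 sums to compare by eye against a printed sheet.
# Assumptions: PREFIX-N-B.raw is the exact payload of barcode B on page N,
# and the page sum is computed over barcodes 1..4 concatenated in order.
page_sums() {
    prefix=$1
    page=$2
    for b in 1 2 3 4; do
        f="$prefix-$page-$b.raw"
        [ -s "$f" ] || continue    # barcode absent or not decoded yet
        printf 'barcode %s: %s\n' "$b" "$(md5sum <"$f" | cut -d' ' -f1)"
    done
    printf 'page %s: %s\n' "$page" \
        "$(cat "$prefix-$page-"[1-4].raw 2>/dev/null | md5sum | cut -d' ' -f1)"
}
```

For example, `page_sums scan 1` would list the sums for page 1 of a restore run that used the prefix `scan`.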

dmtx-restore

THIS IS WORK IN PROGRESS. See the BUGS / TODO section at the bottom of the page.

dmtx-restore scans multiple pages from any scanner supported by SANE, pre-processes the raw scans with ImageMagick, then uses libdmtx to read the barcodes and writes the decoded data to standard output. dmtx-restore expects raw images of pages as generated by dmtx-backup, that is, 4 barcodes per page.

dmtx-restore generates a fair number of intermediate files in the current directory and overwrites any existing files without warning. I strongly recommend creating a temporary folder and making it the current working directory before running dmtx-restore.

Before using, also make sure to customize the SCANPARAMS variable at the beginning of the script to adjust resolution and color mode. Since these parameters depend on your hardware, run the following command for a list of parameters supported by your hardware:

scanimage -A
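As an illustration only, a SCANPARAMS setting might look like the line below. The exact format the script expects is an assumption here, and the option names and values (`--resolution`, `--mode Gray`) are typical of many SANE backends but not guaranteed for yours; use whatever `scanimage -A` reports.

```shell
# Hypothetical values: option names and accepted values vary by SANE backend.
# Check the output of `scanimage -A` for what your device actually supports.
SCANPARAMS="--resolution 300 --mode Gray"
```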

dmtx-restore usage:

dmtx-restore prefix [timeout [threshold [gamma_range]]] >output_file

ARGUMENTS

prefix:      Base file name for all generated files.
             prefix-N.tiff:  raw scan of page N
             prefix-N.png:   trimmed and deskewed full page scan
             prefix-N-B.raw: data read from barcode B on page N
             DO NOT use prefixes starting with a dash (-).
timeout:     Timeout in milliseconds before dmtxread gives up scanning an
             image. Default: 2000.
threshold:   Gray to black and white threshold. Pixels with luminance above
             the threshold value are set to white, others are set to black.
             Default: 50.
gamma_range: If dmtxread fails to read an image, try adjusting the gamma
             within +/- gamma_range/10. For example, with the default
             gamma_range of 3, gamma values from 0.7 to 1.3 are tried.
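To make the gamma_range arithmetic concrete, the sketch below lists the gamma values a given range covers. It assumes steps of 0.1 centred on 1.0; the helper name `gamma_values` is hypothetical, and the order in which dmtx-restore actually tries the values is not documented here.

```shell
#!/bin/sh
# List the gamma values covered by a given gamma_range.
# Assumption: steps of 0.1 centred on 1.0, for ranges between 0 and 10.
gamma_values() {
    range=$1
    i=$(( 10 - range ))                 # work in tenths to avoid floats
    while [ "$i" -le $(( 10 + range )) ]; do
        printf '%d.%d\n' $(( i / 10 )) $(( i % 10 ))
        i=$(( i + 1 ))
    done
}

gamma_values 3    # prints 0.7 through 1.3, one value per line
```

A gamma_range of 0 yields the single value 1.0, which corresponds to no gamma adjustment.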

Full example:

$ mkdir restore.tmp
$ cd restore.tmp
$ dmtx-restore scan 2000 30 5 >key.raw

Place page no. 1 on the scanner.
Press <RETURN> to continue.
Press Q to terminate.
Press S to skip this page (scan-1.tiff already exists)

Scanning page 1...
Progress: 100.0%

Place page no. 2 on the scanner.
Press <RETURN> to continue.
Press Q to terminate.

Scanning page 2...
Progress: 100.0%

Place page no. 3 on the scanner.
Press <RETURN> to continue.
Press Q to terminate.
Q

Processing page 1/2...
scan-1-1: GAMMA=0.9 THRESHOLD=30
scan-1-2: GAMMA=0.9 THRESHOLD=30
scan-1-3: GAMMA=0.9 THRESHOLD=30 -despeckle
scan-1-4: GAMMA=0.9 THRESHOLD=30
Processing page 2/2...
scan-2-1: GAMMA=0.8 THRESHOLD=30
scan-2-2: GAMMA=0.9 THRESHOLD=30
scan-2-3: GAMMA=0.9 THRESHOLD=30 -despeckle
scan-2-4: GAMMA=1.0 THRESHOLD=30

$ md5sum key.raw 
286c70a81f74c95dc233ed11e9e3bc3e  key.raw

In the terminal output, dmtx-restore prints the settings used to successfully decode a barcode. In the above example, gamma values below 1.0 indicate that a higher threshold might have given better results. "-despeckle" means that it had to use the -despeckle filter in ImageMagick.

This does not matter much if everything was read successfully, but if you need to retry, just rerun dmtx-restore with different settings. During the scan process, press S to skip the pages that worked fine, ENTER to rescan failed pages (optional if you only want to try different settings) and Q to start the decoding process. During the decoding process, any barcode that has already been decoded successfully (i.e. the corresponding file PREFIX-N-B.raw exists and has a non-zero length) is skipped. Delete PREFIX-2-4.raw to discard the data read for barcode 4 of page 2, PREFIX-1-* to discard everything for page 1, and PREFIX-* to restart from scratch.
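Before retrying, it can help to see which barcodes are still undecoded. The sketch below applies the same "exists and is non-empty" rule to the PREFIX-N-B.raw files; `missing_barcodes` is a hypothetical helper, not part of dmtx-restore, and it assumes 4 barcodes on every page, so unused slots on a partially filled last page will also be reported.

```shell
#!/bin/sh
# List barcodes whose PREFIX-N-B.raw file is missing or empty.
# Assumption: every page carries 4 barcodes.
missing_barcodes() {
    prefix=$1
    pages=$2
    n=1
    while [ "$n" -le "$pages" ]; do
        for b in 1 2 3 4; do
            [ -s "$prefix-$n-$b.raw" ] || echo "$prefix-$n-$b"
        done
        n=$(( n + 1 ))
    done
}
```

Running `missing_barcodes scan 2` after the example session above would print nothing if all 8 barcodes were decoded.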

Use low thresholds (30 or lower) if the printout is faded or draft quality, and high thresholds (70 or higher) for high quality printouts (this can help filter out noise or smudges).

Although gamma adjustment is not absolutely necessary (you can specify 0 here), it can help you identify ideal threshold settings should you need to retry a partially failed scan.

You will need to manually clean up the intermediate files generated by dmtx-restore (i.e. rm prefix-*). This is on purpose and meant to help diagnose problems and adjust parameters.

Note about automatic document feeders

Tests with the automatic document feeder on an HP OfficeJet Pro 8600 gave mixed results, even at 300 dpi: I had to rescan pages most of the time. The flatbed scanner, however, worked 100% of the time. I doubt that this issue is specific to my hardware; it more likely stems from the way automatic document feeders work.

If you really want to use an automatic document feeder, you will either need to modify the script in order to run scanimage in batch mode, or place the pages one by one in the feeder (i.e. feed page #1, hit RETURN, wait for scan to finish, feed page #2, hit RETURN, etc.).
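One possible shape for the batch-mode approach is sketched below. `--batch`, `--batch-start` and `--format` are standard scanimage options, but `--source ADF` is backend-specific and only an assumption, as is the idea of naming the files PREFIX-N.tiff so that they match what dmtx-restore produces. The helper `adf_scan_cmd` is hypothetical and only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Build a scanimage batch command for an automatic document feeder.
# "--source ADF" is an assumption; check `scanimage -A` for your device.
adf_scan_cmd() {
    prefix=$1
    echo scanimage --source ADF --format=tiff --batch-start=1 "--batch=$prefix-%d.tiff"
}

adf_scan_cmd scan    # prints the command instead of running it
```

Your resolution and color mode options would need to be added to the printed command as well.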

BUGS / TODO

If there are fewer than 4 barcodes on a page (which is normal when less than 4666 bytes were backed up), dmtx-restore will still attempt to decode the blank areas (and ImageMagick will report errors). A way to detect blank areas is needed, or a command line parameter giving the number of barcodes on the last page.
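Until this is addressed, the number of barcodes to expect on the last page can be derived from the size of the original data. This sketch assumes the 6220 bytes per page are split evenly into 4 barcodes of 1555 bytes each and filled in order; `last_page_barcodes` is a hypothetical helper, not an existing option of dmtx-restore.

```shell
#!/bin/sh
# Work out how many barcodes the last page of a backup carries.
# Assumption: 6220 bytes per page, split evenly into 4 barcodes of 1555 bytes.
BYTES_PER_BARCODE=1555
BARCODES_PER_PAGE=4

last_page_barcodes() {
    # Total number of barcodes over all pages, rounded up.
    total=$(( ($1 + BYTES_PER_BARCODE - 1) / BYTES_PER_BARCODE ))
    if [ "$total" -eq 0 ]; then
        echo 0                          # empty input, nothing to decode
        return
    fi
    rem=$(( total % BARCODES_PER_PAGE ))
    [ "$rem" -eq 0 ] && rem=$BARCODES_PER_PAGE   # last page is full
    echo "$rem"
}

last_page_barcodes 4665    # three barcodes, one blank area
last_page_barcodes 4666    # four barcodes, page fully used
```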

License: GNU General Public License v3.0

