This role manages the installation of pypdfocr on a Raspberry Pi, although it should work on any other Debian-based system. Because this role compiles tesseract from source, the first run might take a while to complete. Please be patient.
- pypdfocr installed using pip
- tesseract 3.03 compiled from source
- tesseract English language data
- Ansible 1.6 or higher
- Raspberry Pi (possibly other Debian-based systems)
- Internet connectivity to download tesseract source
The variables below correlate directly with pypdfocr's configuration file. See the pypdfocr docs for more information.
- pypdfocr_target_folder - path to the directory where OCR'd PDFs are placed
- pypdfocr_default_folder - path where PDFs that were not automatically filed are placed
- pypdfocr_original_move_folder - path where original PDFs are stored
- pypdfocr_folders - dictionary/hash of directories and keywords used for automatic filing
- pypdfocr_mail_smtp_server - SMTP server to use for sending emails
- pypdfocr_smtp_login - login for the SMTP server
- pypdfocr_smtp_password - password for the SMTP server
- pypdfocr_mail_from_addr - from address
- pypdfocr_mail_to_list - array of email addresses which will receive emails when a PDF has been OCR'd and filed
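As a rough illustration of how these variables might be set together, here is a minimal sketch of a variables file. Every path, keyword, hostname, and address is a made-up placeholder, and the shape of pypdfocr_folders (folder name mapped to a list of keywords) is an assumption based on the description above; check the role defaults and the pypdfocr docs for the exact structure.

```yaml
---
# Hypothetical file: group_vars/pypdfocr-servers
# All values below are placeholders, not defaults shipped with the role.
pypdfocr_target_folder: /srv/pdf/filed
pypdfocr_default_folder: /srv/pdf/unfiled
pypdfocr_original_move_folder: /srv/pdf/originals

# Assumed shape: folder name -> keywords that trigger filing into it.
pypdfocr_folders:
  finances:
    - invoice
    - bank statement
  travel:
    - airline
    - hotel

pypdfocr_mail_smtp_server: smtp.example.com
pypdfocr_smtp_login: ocr-user
pypdfocr_smtp_password: change-me
pypdfocr_mail_from_addr: ocr@example.com
pypdfocr_mail_to_list:
  - alice@example.com
  - bob@example.com
```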
To change the values of variables, create a file in host_vars/ or group_vars/, or define the variables in the playbook.
There are other options for changing variable values. See Ansible Variable Documentation for more ideas.
Include this role in your plays and set variables as desired.
    ---
    - hosts: pypdfocr-servers
      roles:
        - pypdfocr
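Variables can also be passed directly to the role in the play. The sketch below assumes the same pypdfocr-servers host group as above; the paths and addresses are placeholders chosen for illustration only.

```yaml
---
# Sketch: override a few role variables inline (values are hypothetical).
- hosts: pypdfocr-servers
  roles:
    - role: pypdfocr
      pypdfocr_target_folder: /srv/pdf/filed
      pypdfocr_default_folder: /srv/pdf/unfiled
      pypdfocr_mail_to_list:
        - alice@example.com
```

Variables not listed here fall back to whatever the role defines as defaults or to values set in host_vars/ or group_vars/.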
This role includes Travis CI tests and a Vagrantfile. To run tests locally, run vagrant up.