Text manipulation scripts, OCR, and so on...
Required tools:
- OCRmyPDF: add a searchable text layer to scanned PDFs
- Tesseract OCR: extract text from images
- TessData: Tesseract language files (1)
- pdftotext: extract text from PDF (its companion tool pdfimages, from the same Xpdf tools package, extracts images from PDF)
  - install with: choco install xpdf-utils
- ImageMagick: manage images
- PDFtk: merge, split and edit bookmarks of PDF
(1) Copy the language files to the folder C:\Program Files\Tesseract-OCR\tessdata
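
To confirm that the tools listed above are available before running any script, a quick check along the following lines may help. This is only a sketch, not part of the original scripts: the executable names (`ocrmypdf`, `tesseract`, `pdftotext`, `pdfimages`, `magick`, `pdftk`) are the usual command names and are assumed here, and the helper names `missing_tools` and `installed_languages` are made up for illustration. The tessdata path is the one given in note (1).

```python
import shutil
from pathlib import Path

# Assumed executable names for each required tool; adjust if your install differs.
REQUIRED_TOOLS = {
    "OCRmyPDF": "ocrmypdf",
    "Tesseract OCR": "tesseract",
    "pdftotext": "pdftotext",
    "pdfimages": "pdfimages",
    "ImageMagick": "magick",
    "PDFtk": "pdftk",
}

# Folder from note (1); assumed to be the default Tesseract location on Windows.
TESSDATA_DIR = r"C:\Program Files\Tesseract-OCR\tessdata"


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools whose executable is not found on PATH."""
    return [name for name, exe in tools.items() if which(exe) is None]


def installed_languages(tessdata_dir=TESSDATA_DIR):
    """Return the language codes that have a .traineddata file in the folder."""
    folder = Path(tessdata_dir)
    if not folder.is_dir():
        return []
    return sorted(p.stem for p in folder.glob("*.traineddata"))


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found on PATH.")
    print("Tesseract languages:", installed_languages() or "none found")
```

The `which` parameter exists only so the lookup can be swapped out when testing; in normal use the defaults are enough.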