sachaos / jisui

Convert scanned image PDF file to text annotated PDF file

Jisui (自炊)

This tool is a PoC (proof of concept).

Jisui is a helper tool for creating e-books.
A scanned book usually contains no text information, so you cannot search for text in the PDF.
Jisui extracts the text from a scanned book (PDF) and merges that text into the PDF.

This tool depends on the Google Cloud Vision API to extract text,
so you need a GCP account and your own project.
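
To give a feel for what a Cloud Vision dependency can involve, here is a minimal sketch of the JSON body for the Vision REST method `files:asyncBatchAnnotate`, which runs OCR on a PDF stored in a GCS bucket. This README does not say which Vision endpoint jisui actually calls, so treat this as an assumption rather than a description of jisui's internals. The helper `newPDFOCRRequest`, the `ocr/` output prefix, the batch size, and the bucket and object names are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// The types below mirror only the request fields this sketch needs.
type gcsLocation struct {
	URI string `json:"uri"`
}

type inputConfig struct {
	GCSSource gcsLocation `json:"gcsSource"`
	MimeType  string      `json:"mimeType"`
}

type outputConfig struct {
	GCSDestination gcsLocation `json:"gcsDestination"`
	BatchSize      int         `json:"batchSize"`
}

type feature struct {
	Type string `json:"type"`
}

type fileRequest struct {
	InputConfig  inputConfig  `json:"inputConfig"`
	Features     []feature    `json:"features"`
	OutputConfig outputConfig `json:"outputConfig"`
}

type batchRequest struct {
	Requests []fileRequest `json:"requests"`
}

// newPDFOCRRequest is a hypothetical helper: it asks for document text
// detection on gs://<bucket>/<object> and writes the JSON results under
// gs://<bucket>/ocr/ (an assumed prefix, not something jisui defines).
func newPDFOCRRequest(bucket, object string) batchRequest {
	return batchRequest{Requests: []fileRequest{{
		InputConfig: inputConfig{
			GCSSource: gcsLocation{URI: "gs://" + bucket + "/" + object},
			MimeType:  "application/pdf",
		},
		Features: []feature{{Type: "DOCUMENT_TEXT_DETECTION"}},
		OutputConfig: outputConfig{
			GCSDestination: gcsLocation{URI: "gs://" + bucket + "/ocr/"},
			BatchSize:      20, // pages per output JSON file (assumed value)
		},
	}}}
}

func main() {
	body, err := json.MarshalIndent(newPDFOCRRequest("my-bucket", "scanned.pdf"), "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	// This body would be POSTed, with credentials, to
	// https://vision.googleapis.com/v1/files:asyncBatchAnnotate
	fmt.Println(string(body))
}
```

The sketch only builds and prints the request; sending it requires authenticated access to your own GCP project.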

Jisui (自炊) is Japanese slang for scanning a book to make an e-book.

Pre-requirements

Install

$ go get github.com/sachaos/jisui

Usage

$ jisui -bucket [your GCS bucket] -font [Downloaded font] -output result.pdf [scanned PDF file]

Example

You can see an example PDF file.

Please download it and open it in a PDF viewer.

You will notice the difference when you search for text.

License: MIT License


Languages

Language: Go 100.0%