vanangamudi / thaal-vetti

UI tool for splitting scanned two page pdf

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

thaal-vetti

UI tool for splitting scanned two page pdf

Usage

The tool consists for two scripts for now. `vetti.sh` and `vetti.py`.

`vetti.sh` is the main interface.

vetti.sh csm <path-to-pdf>

where, csm - chunk, split, merge

say the pdf filename is tamil.pdf,

  • chunk - splits the pdf into images (one per page) and places it under the tamil.pdf_pages/ directory
  • split - take images from tamil.pdf_pages and shows you a GUI for drawing a line, where to slice the image and place them under tamil.pdf_split_pages/ directory
  • merge - will collect all the images from tamil.pdf_split_pages/ directory and produce a pdf file

Note

  • it is preferred that for every pdf you run this tool once and finish it off in single stroke
  • but they can be independently run, but note that merge task will delete all the files created by chunk and merge

For example, in case the pdf is already chunked into pages, you can just run,

vetti.sh sm <path-to-pdf>

Configuration/Tweaking

Configuration variables are now defined at the top of the script.

For example

#### configs
# Chunking config
PAGE_RESOLUTION=150

# vetti.py config
DISPLAY_RESOLUTION=0.5
DEBUG='-D' 
CLEAR='--clear'
           

Chunking

VariableFunction
PAGE_RESOLUTIONChange the output resolution of pdf images from chunking

Vetti.py

VariableFunction
DISPLAY_RESOLUTIONChange the size of the GUI window
DEBUGdebug flag, show more windows that necessary but helpful for debugging
CLEARErases the exisitng `.vetti` file

`vetti.py` is GUI tool

`vetti.py` is the captures the core of this project. It is GUI tool based on opencv library.

It is lets you draw the line based on which the pdf-page will be sliced into two parts.

KeyFunction
`Space`save the state of the line
`q`quit
`Enter`Finished tagging and slices the image
`s`Keep the image as it is

Use the `mouse` to draw the line. Click and drag until you’re satisfied.

Once you’re fine with the line, press `Enter` to slice this page and start tagging the next page.

Note

Just in case the current page is single page, press `s` key. This will keep the page intact

Note

Since scale factor is also saved as part of the state, if you want to start afresh, pass –clear or -C switch to the `vetti.py`. This will delete old state and save the current state to `.vetti` file

Clear(–clear/-C) is different from Force(–force/-F) as in –force switch merely show you exisitng line so that you can edit –clear switch erases whole state and start a fresh.

Requirements

bash/shell

  • pdfinfo
  • convert (from imagemagick suite)
  • ghostscript

python

  • opencv

About

UI tool for splitting scanned two page pdf


Languages

Language:Python 87.9%Language:Shell 12.1%