This project is a Python-based PDF preprocessing tool. It supports operations such as removing headers and footers, marking bounding boxes, removing tables, excluding lines, fixing broken words, and saving the result as HTML or TXT, among others.
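To give a feel for two of these operations, here is a minimal sketch of fixing broken words and removing headers and footers, working on text that has already been extracted from a PDF. It is not this project's actual code or API: the function names `fix_broken_words` and `remove_repeated_edges`, the `min_share` parameter, and the heuristics themselves (joining words hyphenated across a line break, and dropping first or last lines that repeat on most pages) are assumptions made for illustration.

```python
import re
from collections import Counter


def fix_broken_words(text: str) -> str:
    """Join words split by a hyphen at a line end, e.g. 'prepro-\\ncessing'.

    Hypothetical helper: it also joins genuine hyphenated compounds that
    happen to break at a line end, so treat it as a rough heuristic.
    """
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


def remove_repeated_edges(pages: list[list[str]], min_share: float = 0.6) -> list[list[str]]:
    """Drop first/last lines that repeat on at least min_share of pages.

    Hypothetical helper: `pages` is a list of pages, each a list of text lines.
    Lines that vary per page (such as 'Page 3') are not detected.
    """
    edge_counts = Counter()
    for lines in pages:
        if lines:
            # Use a set so a one-line page counts its line only once.
            edge_counts.update({lines[0], lines[-1]})

    threshold = min_share * len(pages)
    repeated = {line for line, n in edge_counts.items() if n >= threshold}

    cleaned = []
    for lines in pages:
        body = list(lines)
        if body and body[0] in repeated:
            body = body[1:]   # strip header
        if body and body[-1] in repeated:
            body = body[:-1]  # strip footer
        cleaned.append(body)
    return cleaned


if __name__ == "__main__":
    pages = [
        ["Annual Report", "Intro text", "Confidential"],
        ["Annual Report", "More text", "Confidential"],
        ["Annual Report", "Closing text", "Confidential"],
    ]
    print(remove_repeated_edges(pages))
    print(fix_broken_words("a PDF prepro-\ncessing tool"))
```

A real tool would more likely use the position of each text block on the page (its bounding box) rather than exact string repetition, since headers and footers often contain page numbers that change from page to page.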