Velcorn / PDF2XLSX

A simple script that extracts specific columns from a table in a PDF and saves them to XLSX

Prerequisites:

  • Python installation (see Link)
  • Pip installation (should come with the latest Python version; otherwise, see Link)

Initial Setup:

  • Run pip install -r requirements.txt
  • Modify input_path in main.py to set the path where your PDFs are saved. NOTE: the root folder of input_path must already exist!
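
The note about input_path can be checked up front before running the script. The following is a minimal sketch, not code from main.py: the helper name check_input_path is hypothetical, and it assumes that "root folder" means the parent directory of input_path.

```python
from pathlib import Path


def check_input_path(input_path: str) -> Path:
    """Return input_path as a Path, or raise if its parent folder is missing.

    Assumption: "root folder of input_path" is read as the parent directory.
    This helper is hypothetical and not part of main.py.
    """
    path = Path(input_path)
    root = path.parent
    if not root.is_dir():
        raise FileNotFoundError(f"Root folder does not exist: {root}")
    return path
```

Calling it with the value you set for input_path fails early with a clear message instead of partway through a run.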

Extract table columns from PDF and save to XLSX:

  • Open CMD, PowerShell, or a terminal in the root folder of the project and run the script with python main.py.
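
To illustrate the column-extraction step, here is a minimal sketch that works on a table already read into rows. It is not taken from main.py: the function name select_columns and the sample headers are hypothetical, and reading the PDF and writing the XLSX file are left to whichever libraries requirements.txt installs.

```python
def select_columns(rows, wanted):
    """Keep only the columns named in `wanted`, in the order given.

    `rows` is a list of lists whose first row is the table header.
    Raises ValueError if a wanted header is not present.
    Hypothetical helper for illustration; not part of main.py.
    """
    header = rows[0]
    indices = [header.index(name) for name in wanted]
    return [[row[i] for i in indices] for row in rows]


# Example table with made-up headers and values.
table = [
    ["ID", "Name", "Price"],
    ["1", "Bolt", "0.10"],
    ["2", "Nut", "0.05"],
]
selected = select_columns(table, ["Name", "Price"])
```

The result keeps the header row, so it can be handed to an XLSX writer as-is.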

License:GNU General Public License v3.0


Languages

Language:Python 100.0%