Python-PDF-Scraper

Previous version

This version uses the pdfquery library. It converts the PDF to XML, and I use that XML to locate and extract specific data from the PDF.
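A minimal sketch of that approach, not the repository's actual script: it loads a PDF with pdfquery, optionally dumps the XML tree for inspection, finds a text line containing a label, and reads the text to the right of it. The function names `bbox_selector` and `extract_field`, the `label` and `width` parameters, and the file names in the usage note are hypothetical.

```python
def bbox_selector(x0, y0, x1, y1, element="LTTextLineHorizontal"):
    """Build a pdfquery selector for elements lying fully inside a bounding box."""
    return f'{element}:in_bbox("{x0},{y0},{x1},{y1}")'


def extract_field(pdf_path, label, width=300.0, xml_dump=None):
    """Return the text found to the right of `label`, or None if the label is absent.

    `width` (in PDF points) is an assumed search distance; tune it per document.
    """
    import pdfquery  # imported lazily so the helper above works without it

    pdf = pdfquery.PDFQuery(pdf_path)
    pdf.load()

    # Optional: write the XML tree to disk to look up element names and coordinates.
    if xml_dump:
        pdf.tree.write(xml_dump, pretty_print=True)

    hits = pdf.pq(f'LTTextLineHorizontal:contains("{label}")')
    if not hits:
        return None

    # Coordinates of the first matching line (PDF origin is the bottom-left corner).
    y0 = float(hits.attr("y0"))
    x1 = float(hits.attr("x1"))
    y1 = float(hits.attr("y1"))

    # Search a strip starting where the label ends, with 1 pt of vertical slack.
    selector = bbox_selector(x1, y0 - 1, x1 + width, y1 + 1)
    return pdf.pq(selector).text().strip()
```

For example, `extract_field("invoice.pdf", "Invoice No", xml_dump="invoice.xml")` would return whatever text sits on the same line to the right of "Invoice No", assuming the PDF lays that value out as its own text line.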

Newer version

This version uses PyMuPDF (imported in Python as fitz) to extract the highlighted text from a PDF. It requires no XML conversion and is a lot faster and somewhat more accurate than the previous version. fitz is the import name that PyMuPDF provides, so only PyMuPDF needs to be installed. Before running it, run the command: pip install PyMuPDF
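A minimal sketch of highlight extraction with PyMuPDF, under stated assumptions rather than the repository's exact code: it walks each page's annotations, keeps the highlight ones, and collects the words whose boxes mostly fall inside the highlight rectangle. The names `rects_overlap` and `extract_highlights` and the 0.5 overlap threshold are hypothetical. Because it uses the annotation's bounding rectangle, a highlight spanning several lines may pick up neighbouring words.

```python
def rects_overlap(a, b, min_fraction=0.5):
    """True if rectangle b covers at least `min_fraction` of rectangle a's area.

    Rectangles are (x0, y0, x1, y1) tuples.
    """
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    if ix1 <= ix0 or iy1 <= iy0:
        return False  # no intersection
    intersection = (ix1 - ix0) * (iy1 - iy0)
    area = (a[2] - a[0]) * (a[3] - a[1])
    return area > 0 and intersection / area >= min_fraction


def extract_highlights(pdf_path):
    """Return a list of (page_number, highlighted_text) tuples, pages counted from 1."""
    import fitz  # PyMuPDF; imported lazily so rects_overlap works without it

    results = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            # Each word is (x0, y0, x1, y1, text, block_no, line_no, word_no).
            words = page.get_text("words")
            for annot in page.annots():
                if annot.type[0] != fitz.PDF_ANNOT_HIGHLIGHT:
                    continue
                box = tuple(annot.rect)
                picked = [w[4] for w in words if rects_overlap(w[:4], box)]
                if picked:
                    results.append((page.number + 1, " ".join(picked)))
    return results
```

Calling `extract_highlights("notes.pdf")` (a hypothetical file name) would return one entry per highlight annotation that covers at least one word.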
