GreggRoll / PDF-crawler

PDF extraction and JSON conversion

main.py runs PDF link extraction from crawler.py, then converts the PDFs to JSON with pdf2json.py.
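As a rough illustration of that two-step flow, here is a minimal sketch using only the Python standard library. It is not the repository's actual code: `PdfLinkParser`, `extract_pdf_links`, `run_pipeline`, and `stub_convert` are hypothetical names, and the conversion step is a placeholder that only records the source URL instead of parsing a PDF.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkParser(HTMLParser):
    """Collects absolute URLs of <a> tags whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        absolute = urljoin(self.base_url, href)
        # Check the path only, so a query string does not hide the extension.
        if urlparse(absolute).path.lower().endswith(".pdf"):
            self.links.append(absolute)


def extract_pdf_links(html, base_url):
    """Hypothetical stand-in for the crawler step."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def stub_convert(url):
    """Placeholder for the conversion step; a real converter would parse the PDF."""
    return json.dumps({"source": url, "pages": []})


def run_pipeline(html, base_url, convert=stub_convert):
    """Hypothetical stand-in for main.py: extract links, then convert each one."""
    return [convert(url) for url in extract_pdf_links(html, base_url)]
```

Passing the converter in as a function keeps the extraction and conversion steps independent, which mirrors the split between crawler.py and pdf2json.py described above.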

Future additions will include format-based clustering using clustering.py.

Languages

Jupyter Notebook 89.9%, Python 10.1%