JoshuaDavidWarner / pdf_to_html

converts all pdfs from a folder to text files

In 2016, Masha Gorkovenko created a tutorial for Stanford on converting PDFs to text.

Using Python 2.7 and PDFMiner, I have added code to create better text filenames.
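
The folder conversion and filename cleanup described above could look roughly like the sketch below. It is not the notebook's actual code: it assumes Python 3 and the `pdfminer.six` fork (with its `pdfminer.high_level.extract_text` helper) rather than Python 2.7 and the original PDFMiner, and the names `clean_text_filename` and `convert_folder`, along with the naming rule itself, are hypothetical.

```python
import os
import re


def clean_text_filename(pdf_name):
    """Turn a PDF filename into a tidy .txt filename (hypothetical naming rule)."""
    stem, _ext = os.path.splitext(os.path.basename(pdf_name))
    # Collapse every run of characters other than letters, digits, and
    # hyphens into a single underscore, then trim underscores and lowercase.
    stem = re.sub(r"[^A-Za-z0-9\-]+", "_", stem).strip("_").lower()
    # Fall back to a placeholder if nothing usable is left.
    return (stem or "untitled") + ".txt"


def convert_folder(pdf_dir, out_dir):
    """Extract text from every PDF in pdf_dir into a .txt file in out_dir."""
    # Imported here so the filename helper works without pdfminer.six installed.
    # Assumption: pdfminer.six, not the Python 2.7 PDFMiner the notebook uses.
    from pdfminer.high_level import extract_text

    os.makedirs(out_dir, exist_ok=True)
    written = []
    for name in sorted(os.listdir(pdf_dir)):
        if not name.lower().endswith(".pdf"):
            continue  # skip anything that is not a PDF
        text = extract_text(os.path.join(pdf_dir, name))
        out_path = os.path.join(out_dir, clean_text_filename(name))
        with open(out_path, "w", encoding="utf-8") as handle:
            handle.write(text)
        written.append(out_path)
    return written
```

A call such as `convert_folder("pdfs", "texts")` would write one text file per PDF. Two PDFs whose cleaned names collide would overwrite each other under this rule, so a real version may need a de-duplication step.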

Languages

Language:Jupyter Notebook 100.0%