pcschreiber1 / PDF_Extraction-Translation

Translate many large PDF Reports for free using Python.


Translate long PDF-Reports in Python.


The approach is described in the corresponding Towards Data Science article; alternatively, follow the Jupyter Notebook Article_PDF-Translation. The example Central Bank report is stored in src/examples.

This repo contains a pipeline developed for a work project in which a large number of official reports from different OECD countries had to be translated. To translate free of charge, it uses the Google Translate API. The main Python packages are pdfplumber, deep_translator, and pyfpdf2.
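The extract-and-translate steps can be sketched roughly as follows. This is a minimal illustration, not the repo's actual code: the function names (`split_into_chunks`, `translate_text`, `translate_pdf`) and the `MAX_CHARS` value are hypothetical, and the sketch assumes the translator accepts only a few thousand characters per request, so long page text is split at sentence boundaries first. Writing the translated text back to a PDF is omitted.

```python
import re
from typing import Callable, List

# Assumption: stay safely below a per-request limit of roughly 5,000 characters.
MAX_CHARS = 4500


def split_into_chunks(text: str, max_chars: int = MAX_CHARS) -> List[str]:
    """Group sentences into chunks of at most max_chars characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-split a single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def translate_text(text: str, translate: Callable[[str], str]) -> str:
    """Translate text chunk by chunk and join the results with spaces."""
    return " ".join(translate(chunk) for chunk in split_into_chunks(text))


def translate_pdf(path: str, source: str = "auto", target: str = "en") -> List[str]:
    """Return one translated string per page (hypothetical helper)."""
    # Imported lazily so the helpers above work without these packages installed.
    import pdfplumber
    from deep_translator import GoogleTranslator

    translator = GoogleTranslator(source=source, target=target)
    with pdfplumber.open(path) as pdf:
        return [
            translate_text(page.extract_text() or "", translator.translate)
            for page in pdf.pages
        ]
```

Passing the translator in as a plain callable keeps the chunking logic testable offline; `translate_pdf("report.pdf", target="en")` would then require network access and both packages to be installed.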


License: MIT License


Languages: Jupyter Notebook 53.5%, Python 46.5%