PDF “Swiss Army knife” (multitool): translation, OCR, text extraction, and more.
git clone https://github.com/Nehc/PDFity.git
cd PDFity
docker-compose up
Then open http://localhost:7860/ in your browser.
- You have a scanned PDF and want its text recognized and stored in a text layer. Upload the file and click OCR. You can set the language if needed.
- You have received a PDF with OCR (or you have a regular PDF with text content) and want to translate it into Russian. You can do this in two ways:
  - Just click Translate (RU). However, this is not a great idea, because automatic translation is not very good with formulas and similar content.
  - Get the original and the translation in one document, page by page: select the eng+rus option under language and run the translation.
- You can extract the text. I don't know for what purpose, but it might come in handy! Just use the Extract text option.
- For the best quality, the Russian text can be corrected with a spell checker (jamspell).
- And last (for now), you can upload an image, which will be automatically converted to PDF so that all of the operations above work on it.
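
The eng+rus page-by-page layout mentioned above can be pictured as a simple interleaving of two page sequences. The sketch below is only an illustration of that idea, not PDFity's actual code: the function name `interleave_pages` is hypothetical, and pages are represented by plain strings rather than real PDF page objects.

```python
from itertools import zip_longest


def interleave_pages(original, translated):
    """Order pages as original[0], translated[0], original[1], ...

    Hypothetical helper for illustration; not part of PDFity.
    If one sequence is longer, its remaining pages are kept at the end.
    """
    merged = []
    for orig_page, trans_page in zip_longest(original, translated):
        if orig_page is not None:
            merged.append(orig_page)
        if trans_page is not None:
            merged.append(trans_page)
    return merged


if __name__ == "__main__":
    # Strings stand in for pages of the English and Russian documents.
    print(interleave_pages(["en-1", "en-2"], ["ru-1", "ru-2"]))
```

With a real PDF library, the same ordering would presumably be applied to page objects before writing the combined file.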