aih / pdf2html

Convert pdf files to text and html (eventually MS Word, too)

pdf2html

A web-based converter from pdf to html and, eventually, other formats (text and MS Word). The UI is based on a Django-backed app built on jQuery-File-Upload. jQuery-File-Upload was developed by Sebastian Tschan, and its source is available on GitHub. It was ported to Django by Sigurd Gartmann (sigurdga on GitHub).

I connected the UI to a back-end pdf converter. To use jQuery-File-Upload in your own Django app, branch from [django-jquery-file-upload](https://github.com/sigurda/django-jquery-file-upload).

TODO: Use the terrific [pdf2htmlEX](https://github.com/coolwanglu/pdf2htmlEX/wiki/Quick-Start) library for pdf to html conversion, with ttfautohint as the external hinting tool (`--external-hint-tool=ttfautohint`).
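Since this step is still a TODO, the following is only a rough sketch of how the Django back end might shell out to pdf2htmlEX. It assumes `pdf2htmlEX` and `ttfautohint` are installed and on the `PATH`. The helper names `build_pdf2htmlex_command` and `convert_pdf_to_html` are hypothetical and not part of this repo.

```python
import subprocess
from pathlib import Path


def build_pdf2htmlex_command(pdf_path, dest_dir, hint_tool="ttfautohint"):
    """Return the argument list for one pdf2htmlEX run (hypothetical helper)."""
    pdf_path = Path(pdf_path)
    return [
        "pdf2htmlEX",
        # Hand font hinting to an external tool, as proposed in the TODO.
        "--external-hint-tool=" + hint_tool,
        # Directory that receives the generated HTML.
        "--dest-dir", str(dest_dir),
        str(pdf_path),
        # Output file name, derived from the input name.
        pdf_path.stem + ".html",
    ]


def convert_pdf_to_html(pdf_path, dest_dir):
    """Run pdf2htmlEX and return the path of the generated HTML file.

    Raises subprocess.CalledProcessError if the converter exits non-zero.
    """
    command = build_pdf2htmlex_command(pdf_path, dest_dir)
    subprocess.run(command, check=True)
    return Path(dest_dir) / command[-1]
```

Keeping the command construction separate from the `subprocess.run` call makes the arguments easy to inspect or log without the converter being installed.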

Conversion to Word can use pandoc.
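A minimal sketch of the pandoc step, assuming `pandoc` is installed and that the input is an HTML file already on disk. The helper names `build_pandoc_command` and `convert_html_to_docx` are hypothetical, and how well pandoc copes with heavily styled HTML has not been checked here.

```python
import subprocess
from pathlib import Path


def build_pandoc_command(html_path, docx_path=None):
    """Return the argument list for an HTML to .docx pandoc run (hypothetical helper)."""
    html_path = Path(html_path)
    if docx_path is None:
        # Default: write the .docx next to the input file.
        docx_path = html_path.with_suffix(".docx")
    return [
        "pandoc",
        str(html_path),
        "-f", "html",   # input format
        "-t", "docx",   # output format
        "-o", str(docx_path),
    ]


def convert_html_to_docx(html_path, docx_path=None):
    """Run pandoc and return the path of the generated Word file."""
    command = build_pandoc_command(html_path, docx_path)
    subprocess.run(command, check=True)
    return Path(command[-1])
```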

License

MIT, as the original project. See LICENSE.txt.

