ktaaaki / paper2html

Converts a single- or double-column PDF paper into an HTML page that shows the original view alongside a paragraph view extracted from the paper, so the text can be translated in the browser.

Paragraph detection

ktaaaki opened this issue

In some PDFs, paragraph detection is too fine or too coarse. Why don't you implement your own paragraph detection logic?
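
As a starting point for discussion, here is a minimal sketch of one possible approach: start a new paragraph whenever the vertical gap between consecutive lines is large relative to the typical line height. It is not how paper2html currently works. The `Line` structure, the `group_paragraphs` function, and the `gap_factor` parameter are all hypothetical names introduced for illustration, and the sketch assumes line bounding boxes are already available, with y growing downward in a single column.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Line:
    """A text line with its vertical extent (hypothetical structure, y grows downward)."""
    top: float
    bottom: float
    text: str


def group_paragraphs(lines, gap_factor=0.6):
    """Group lines into paragraphs by the vertical gap between them.

    gap_factor is a hypothetical tuning knob: a new paragraph starts when the
    gap to the previous line exceeds gap_factor * median line height.
    """
    if not lines:
        return []
    # Process lines top to bottom (assumes a single column).
    ordered = sorted(lines, key=lambda ln: ln.top)
    threshold = gap_factor * median(ln.bottom - ln.top for ln in ordered)
    paragraphs = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.top - prev.bottom > threshold:
            paragraphs.append([cur])  # large gap: start a new paragraph
        else:
            paragraphs[-1].append(cur)  # small gap: same paragraph
    return [" ".join(ln.text for ln in para) for para in paragraphs]
```

A threshold that is too small splits every line into its own paragraph (too fine), and one that is too large merges everything (too coarse), so `gap_factor` would probably need tuning per document. Double-column layouts would also need the columns separated before grouping.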