hshindo / PDFExtract.jl

PDF Reader based on PDFBox for Julia

PDFExtract

Requirements

  • Java 8
  • Julia 1.0

Setup

DeepFigures

Install deepfigures-open.
Note that its pull request #3 must be merged before installation.

Extract figures and tables from a PDF:

$ python manage.py build
$ python manage.py detectfigures [out directory] [pdf file]

Change the owner of the deepfigures output:

$ sudo chown -R [username] [out directory]

PDFExtract

Install Julia 1.0.x.
Start julia and press ] to enter Pkg mode, then run:

pkg> add https://github.com/hshindo/PDFExtract.jl.git
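
For scripted setups, the same installation can probably be done without the interactive Pkg prompt. The following is a minimal sketch using Julia's standard Pkg API; the names PDFEXTRACT_URL and install_pdfextract are hypothetical and are not part of PDFExtract.

```julia
using Pkg

# Repository URL taken from the pkg> command above.
const PDFEXTRACT_URL = "https://github.com/hshindo/PDFExtract.jl.git"

# Hypothetical helper (not part of PDFExtract): add the package from its
# Git URL. This needs network access, so it is defined here but not called.
install_pdfextract() = Pkg.add(Pkg.PackageSpec(url=PDFEXTRACT_URL))
```

Calling install_pdfextract() once from a setup script should be equivalent to the pkg> add command.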

Using PDFExtract

Put xxx.pdf and xxxdeepfigures-results.json (the deepfigures output) in the same directory, then run:

using PDFExtract

pdfpath = "/home/xxx"
pdf2xml(pdfpath)
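
Since pdf2xml expects the PDF and the deepfigures JSON side by side, it may help to check for both files first. The sketch below assumes the naming pattern described above (name.pdf and namedeepfigures-results.json); missing_inputs is a hypothetical helper, not a function provided by PDFExtract.

```julia
# Hypothetical helper (not part of PDFExtract): return the expected input
# files for base name `name` that do not exist in directory `dir`.
function missing_inputs(dir::AbstractString, name::AbstractString)
    expected = [
        joinpath(dir, name * ".pdf"),                      # the PDF itself
        joinpath(dir, name * "deepfigures-results.json"),  # deepfigures output
    ]
    return filter(!isfile, expected)
end

# Example: report anything missing for "xxx" in the current directory.
for path in missing_inputs(pwd(), "xxx")
    println("missing: ", path)
end
```

An empty result means both inputs are in place.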

License: MIT License