allenai / science-parse

Science Parse parses scientific papers (in PDF form) and returns them in structured form.

Few Queries

djshowtime opened this issue

Is it possible to parse a PDF into text and get the following details of the paper?
"""
Title
Authors
Abstract
Sections (each with heading and body text)
"""
Is your PDF reader similar to Grobid?

I am asking because network issues keep me from getting either of the following to work.

  1. Downloading the model files always fails:
08:15:03.951 [main] WARN  org.allenai.datastore.Datastore - java.net.SocketTimeoutException: Read timed out while downloading org.allenai.scienceparse/productionBibModel-v7.dat. 6 retries left.
  2. I then tried your hosted service, but the URL on this domain does not work for me: http://scienceparse.allenai.org
    e.g. http://scienceparse.allenai.org/v1/498bb0efad6ec15dd09d941fb309aa18d6df9f5f?skipFields=sections
504 Gateway Time-out
nginx/1.4.6 (Ubuntu)
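
To help reproduce the 504, here is a minimal Python sketch of the same request with an explicit timeout and a few retries on 5xx responses. The host, the `/v1/<paper id>` path and the `skipFields` parameter are taken only from the example URL above. The function names, the retry count and the backoff are illustrative assumptions, not part of any documented Science Parse API.

```python
import socket
import time
import urllib.error
import urllib.parse
import urllib.request

# Host and /v1 path copied from the example URL in this issue.
BASE_URL = "http://scienceparse.allenai.org/v1"


def build_url(paper_id, skip_fields=None):
    """Build a request URL; `skipFields` is inferred from the example URL."""
    url = f"{BASE_URL}/{paper_id}"
    if skip_fields:
        url += "?" + urllib.parse.urlencode({"skipFields": skip_fields})
    return url


def fetch(url, retries=3, timeout=60.0, backoff=2.0,
          opener=urllib.request.urlopen, sleep=time.sleep):
    """Return the response body as bytes, retrying on 5xx errors and timeouts.

    `opener` and `sleep` are injectable (hypothetical hooks) so the retry
    logic can be exercised without network access.
    """
    for attempt in range(retries + 1):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # 4xx errors will not go away on retry; 5xx (such as 504) might.
            if err.code < 500 or attempt == retries:
                raise
        except (urllib.error.URLError, socket.timeout, TimeoutError):
            if attempt == retries:
                raise
        # Linear backoff before the next attempt.
        sleep(backoff * (attempt + 1))


if __name__ == "__main__":
    paper_id = "498bb0efad6ec15dd09d941fb309aa18d6df9f5f"
    body = fetch(build_url(paper_id, skip_fields="sections"))
    print(body[:200])
```

If the gateway keeps returning 504 after all retries, `fetch` re-raises the last `HTTPError`, which would suggest the problem is on the server side rather than a short network glitch.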