marchelbling / papper

A collection of Python tools to crawl, store, and transform open-access scholarly material.

Papper

An open-source collection of Python tools to crawl, store, and transform open-access scholarly documents.

The initial purpose of this project is to make arXiv more "modern" and offer:

a JSON API without throttling
an HTML version of articles available under CC licences
richer metadata parsing, e.g. bibliographies or formulas (most arXiv articles come with their LaTeX source, which is much easier to parse than PDF documents)
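
To illustrate why LaTeX source is easier to mine for metadata than a PDF, here is a minimal sketch that pulls bibliography keys and inline formulas out of a LaTeX string with regular expressions. The function names (`extract_bibitem_keys`, `extract_inline_formulas`) and the sample document are hypothetical and are not taken from papper's code. The patterns are deliberately naive: they only cover `\bibitem` entries and `$...$` inline math, so display math, math environments and `.bib` files are not handled.

```python
import re
from typing import List

# Hypothetical helpers for illustration only; not part of papper's actual API.
# Matches \bibitem{key} and \bibitem[label]{key}, capturing the key.
BIBITEM_RE = re.compile(r"\\bibitem(?:\[[^\]]*\])?\{([^}]+)\}")
# Matches inline math delimited by unescaped dollar signs (naive pattern).
INLINE_MATH_RE = re.compile(r"(?<!\\)\$(.+?)(?<!\\)\$")


def extract_bibitem_keys(latex: str) -> List[str]:
    r"""Return the citation keys declared with \bibitem, in document order."""
    return BIBITEM_RE.findall(latex)


def extract_inline_formulas(latex: str) -> List[str]:
    r"""Return the contents of inline $...$ formulas, stripped of whitespace."""
    return [formula.strip() for formula in INLINE_MATH_RE.findall(latex)]


# Made-up sample document, used only to exercise the helpers.
sample = r"""
The energy is $E = mc^2$ as shown in \cite{einstein1905}.
\begin{thebibliography}{9}
\bibitem{einstein1905} A. Einstein, Annalen der Physik, 1905.
\bibitem[Noether]{noether1918} E. Noether, Invariante Variationsprobleme, 1918.
\end{thebibliography}
"""

print(extract_bibitem_keys(sample))
print(extract_inline_formulas(sample))
```

A real pipeline would likely need a proper LaTeX parser rather than regular expressions, since macros, comments and nested braces quickly defeat simple patterns.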


MIT License

Language: Python 100.0%