inspirehep / hepcrawl

Scrapy project for feeds into INSPIRE-HEP

Home Page: http://inspirehep.net

Repository from GitHub: https://github.com/inspirehep/hepcrawl

HEPcrawl

HEPcrawl is a harvesting library based on Scrapy (http://scrapy.org) for INSPIRE-HEP (http://inspirehep.net). It focuses on automatic and semi-automatic retrieval of new content from all the sources the site aggregates, in particular content from major and minor publishers in the field of High-Energy Physics.

The project is currently in an early stage of development.

See full documentation at http://pythonhosted.org/hepcrawl

About

License: Other


Languages

Python 95.0%, HTML 4.3%, Shell 0.4%, C 0.4%