jeexianwu / pyspider

Integrated Solr into pyspider for indexing and searching crawl data

pyspider

A powerful spider (web crawler) system in Python.

  • Write scripts in Python with a powerful API
  • Supports Python 2 and 3
  • Powerful WebUI with script editor, task monitor, project manager and result viewer
  • JavaScript pages supported
  • MySQL, MongoDB, SQLite, PostgreSQL as database backend
  • Task priority, retry, periodic crawling, recrawl by age, and more
  • Distributed architecture

Documentation: http://docs.pyspider.org/
Tutorial: http://docs.pyspider.org/en/latest/tutorial/

In this fork, I integrated Solr into pyspider for indexing and searching crawl data.
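This README does not show how the fork hands crawl results to Solr, so the following is only a rough sketch of the general idea, not the fork's actual code. It assumes a Solr instance at `http://localhost:8983/solr`, a core named `pyspider`, and result dicts that carry a `url` key; the function name `build_solr_update_request` is hypothetical. It uses Solr's JSON update endpoint (`/update` with a JSON array of documents) and only the Python 3 standard library.

```python
import json
import urllib.request


def build_solr_update_request(results,
                              solr_url="http://localhost:8983/solr",  # assumed location
                              core="pyspider"):                       # assumed core name
    """Build (but do not send) a POST request that indexes crawl results in Solr.

    Each result is a dict with at least a "url" key (an assumption about the
    result shape). The URL doubles as the Solr document id unless the result
    already provides its own "id".
    """
    docs = []
    for result in results:
        doc = {"id": result["url"], "url": result["url"]}
        # Copy the remaining fields (title, text, ...) into the document.
        doc.update({k: v for k, v in result.items() if k != "url"})
        docs.append(doc)

    body = json.dumps(docs).encode("utf-8")
    # commit=true makes the documents searchable as soon as the update returns.
    endpoint = "{}/{}/update?commit=true".format(solr_url.rstrip("/"), core)
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually index the documents, pass the returned request to `urllib.request.urlopen`. Building the request separately from sending it keeps the sketch easy to inspect without a running Solr server.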

About

License: Apache License 2.0


Languages

  • Python 81.3%
  • JavaScript 9.5%
  • CSS 4.9%
  • HTML 4.2%