JuanMadHardy / phpWebCralwer

A web crawler in PHP

phpWebCralwer

A web crawler in PHP.

Note that an additional file, CONFIG_db.php, is required. This file sets the database server, name and password, as well as various other global options. An example file (example_CONFIG_db.php) is included and can be copied as a starting point.
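As a rough illustration of what such a configuration file might contain, here is a minimal sketch. The constant names (DB_HOST, DB_NAME, DB_USER, DB_PASSWORD) and the helpers buildDsn() and connect() are assumptions made for this example; the option names the crawler actually expects are the ones defined in example_CONFIG_db.php.

```php
<?php
// Hypothetical sketch only: the real option names live in
// example_CONFIG_db.php and may differ from the ones below.
const DB_HOST = 'localhost';
const DB_NAME = 'crawler';
const DB_USER = 'crawler_user';
const DB_PASSWORD = 'change-me';

// Build a PDO data source name for MySQL from a host and database name.
function buildDsn($host, $name)
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $name);
}

// Open a PDO connection that throws exceptions on SQL errors.
// Requires the PDO MySQL driver and a reachable MySQL server.
function connect()
{
    return new PDO(buildDsn(DB_HOST, DB_NAME), DB_USER, DB_PASSWORD, array(
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ));
}
```

Keeping credentials in a separate file like this makes it easy to exclude them from version control while still shipping an example file.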

TODO:

  • Integrate the Public Suffix List, so that registrable domains are parsed correctly for the domains table
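To show why the Public Suffix List matters here, the following is a minimal sketch of suffix-based domain parsing. It is not part of the crawler: the function name registrableDomain() and the tiny hard-coded suffix set are assumptions for illustration, and the sketch ignores the wildcard and exception rules that the real list contains.

```php
<?php
// Hypothetical helper: returns the registrable domain of a host name,
// or null if the host has none (for example, it is itself a public suffix).
// $suffixes is a set of public suffixes stored as array keys.
function registrableDomain($host, array $suffixes)
{
    $labels = explode('.', strtolower(trim($host, '.')));
    $count = count($labels);

    // Try the longest candidate suffix first so "co.uk" wins over "uk".
    for ($i = 0; $i < $count; $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (isset($suffixes[$candidate])) {
            if ($i === 0) {
                return null; // the host is a public suffix itself
            }
            return $labels[$i - 1] . '.' . $candidate;
        }
    }

    // Default rule: with no match, treat the last label as the suffix.
    if ($count < 2) {
        return null;
    }
    return $labels[$count - 2] . '.' . $labels[$count - 1];
}

// A tiny stand-in for the real list, which has thousands of entries.
$suffixes = array_flip(array('com', 'uk', 'co.uk'));
echo registrableDomain('www.example.co.uk', $suffixes), "\n";
echo registrableDomain('blog.example.com', $suffixes), "\n";
```

Without the suffix set, a naive "last two labels" rule would file www.example.co.uk under co.uk, grouping unrelated sites into a single domain entry.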

Prerequisites

  • PHP
  • MySQL
  • HTML Tidy extension (php5-tidy)
  • cURL extension (php5-curl)
  • PDO with the MySQL driver (php5-mysql)
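A quick way to confirm that the PHP extensions are available is to query them at runtime. The sketch below assumes the packages listed above correspond to the extension names tidy, curl, PDO and pdo_mysql; the function name missingExtensions() is made up for this example.

```php
<?php
// Hypothetical helper: returns the names from $required that are not loaded.
function missingExtensions(array $required)
{
    $missing = array();
    foreach ($required as $name) {
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

// Extension names assumed to match the prerequisite packages.
$missing = missingExtensions(array('tidy', 'curl', 'PDO', 'pdo_mysql'));
if ($missing) {
    echo 'Missing PHP extensions: ', implode(', ', $missing), "\n";
} else {
    echo "All required PHP extensions are loaded.\n";
}
```

Running this from the command line before the first crawl can save time, since a missing extension otherwise only surfaces as a fatal error partway through a run.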

License: Other


Languages

  • PHP 99.3%
  • R 0.7%