bsnape / ghost-crawler

a lightweight web crawler using Ghost Driver and PhantomJS

GhostCrawler

A lightweight web crawler that uses GhostDriver, the PhantomJS project's implementation of the WebDriver Wire Protocol.

Prerequisites

Install PhantomJS using Homebrew.

$ brew update && brew install phantomjs

Start PhantomJS in WebDriver mode on port 9134 and leave it running.

$ phantomjs --webdriver=9134
PhantomJS is launching GhostDriver...
[INFO  - 2014-10-30T20:58:59.516Z] GhostDriver - Main - running on port 9134
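
To confirm from Ruby that the server started above is reachable, a minimal sketch along these lines may help. It assumes GhostDriver answers the WebDriver Wire Protocol's `GET /status` command with a JSON body. The helper names `ghostdriver_status_uri` and `ghostdriver_running?` are hypothetical and are not part of GhostCrawler.

```ruby
require 'net/http'
require 'uri'
require 'json'

# Hypothetical helper (not part of GhostCrawler): builds the URL of the
# Wire Protocol status command. Port 9134 matches the command shown above.
def ghostdriver_status_uri(port = 9134, host = 'localhost')
  URI("http://#{host}:#{port}/status")
end

# Hypothetical helper: returns true when a server answers GET /status with
# a successful response whose body parses as a JSON object, false otherwise.
def ghostdriver_running?(port = 9134, host = 'localhost')
  uri = ghostdriver_status_uri(port, host)
  response = Net::HTTP.start(uri.host, uri.port,
                             open_timeout: 2, read_timeout: 2) do |http|
    http.get(uri.path)
  end
  return false unless response.is_a?(Net::HTTPSuccess)

  JSON.parse(response.body).is_a?(Hash)
rescue SystemCallError, SocketError, IOError, Timeout::Error, JSON::ParserError
  # Connection refused, unknown host, timeout, or a non-JSON reply.
  false
end

# Example use:
#   abort 'Start PhantomJS first' unless ghostdriver_running?
```

The check uses only the Ruby standard library, so it can run before any WebDriver client gem is loaded.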

Usage

TODO: Write usage instructions here

Roadmap

  1. bin directory
  2. command line arguments
  3. website dependency graph
  4. screenshot image diffing

License: MIT License

Language: Ruby 100.0%