dungnn / owlcrawler

Crawl the web using Apache Mesos and Go

OwlCrawler

OwlCrawler is a distributed web crawler written in Go that uses Apache Mesos to schedule its workers.

Building

Build the framework

go build -tags=testSched -o owlcrawler-framework owlcrawler_framework.go

Build the executor

go build -tags=testExec -o owlcrawler-executor owlcrawler_executor.go

Run

./owlcrawler-framework \
--master=192.168.1.73:5050 \
--executor="owlcrawler-executor" \
--artifactPort=7070 \
--address=192.168.1.73 \
--logtostderr=true

--artifactPort and --address identify the server that hosts the executor binary. In this example, the framework itself serves the file through a built-in HTTP handler.
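To illustrate how an address, a port and an executor name can combine into a download URL, here is a minimal sketch of a handler that serves a single binary over HTTP. It is not taken from the OwlCrawler source: the names executorURI and artifactHandler are hypothetical, and the URL layout (executor name as the path) is an assumption made for the example.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/http"
	"strconv"
)

// executorURI builds the URL a Mesos agent could use to fetch the executor.
// Hypothetical helper: the real framework may lay out its URLs differently.
func executorURI(address string, artifactPort int, executor string) string {
	hostPort := net.JoinHostPort(address, strconv.Itoa(artifactPort))
	return fmt.Sprintf("http://%s/%s", hostPort, executor)
}

// artifactHandler serves the file at localPath under /<name> and nothing else.
// Hypothetical helper, shown only to illustrate the idea of a built-in handler.
func artifactHandler(name, localPath string) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/"+name, func(w http.ResponseWriter, r *http.Request) {
		http.ServeFile(w, r, localPath)
	})
	return mux
}

func main() {
	// Values mirror the flags in the run command above.
	const (
		address      = "192.168.1.73"
		artifactPort = 7070
		executor     = "owlcrawler-executor"
	)

	log.Printf("executor would be fetched from %s",
		executorURI(address, artifactPort, executor))

	// Listen on all interfaces so the sketch also starts on machines
	// that do not own the example address.
	listenAddr := ":" + strconv.Itoa(artifactPort)
	log.Fatal(http.ListenAndServe(listenAddr, artifactHandler(executor, "./"+executor)))
}
```

With the values from the run command, executorURI returns http://192.168.1.73:7070/owlcrawler-executor, which is the kind of address an agent would need to be able to reach in order to download the executor.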

License: Apache License 2.0

Languages: Go 98.5%, Shell 1.5%