desp0916 / LearnStormCrawler

Learning StormCrawler

LearnStormCrawler

This has been generated by the StormCrawler Maven Archetype as a starting point for building your own crawler. Have a look at the code and resources and modify them to your heart's content.

mvn clean compile exec:java -Dexec.mainClass=net.pic.crawler.CrawlTopology -Dexec.args="-conf crawler-conf.yaml -local"

The command above runs the demo CrawlTopology in local mode, without Storm installed.

With Storm installed, you can generate an uberjar:

mvn clean package

and then submit the topology using the storm command:

storm jar target/stormcrawler-1.0-SNAPSHOT.jar net.pic.crawler.CrawlTopology -conf crawler-conf.yaml -local

This runs the topology in local mode. Remove the '-local' flag to run it in distributed mode.
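As a minimal sketch of the local/distributed switch, the helper below prints the storm command for a given mode. The function name `storm_cmd` and the mode names are hypothetical and not part of the generated project; the jar path, main class and config file are the ones shown in this README.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the storm command for a mode.
# Jar path, main class and config file are taken from this README.
storm_cmd() {
  mode="${1:-local}"   # "local" or "distributed" (names chosen for this sketch)
  base="storm jar target/stormcrawler-1.0-SNAPSHOT.jar net.pic.crawler.CrawlTopology -conf crawler-conf.yaml"
  case "$mode" in
    # Local mode keeps the -local flag.
    local)       printf '%s\n' "$base -local" ;;
    # Distributed mode is the same command without -local.
    distributed) printf '%s\n' "$base" ;;
    *)           printf 'unknown mode: %s\n' "$mode" >&2; return 1 ;;
  esac
}
```

Calling `storm_cmd distributed` prints the command so it can be checked before running it, for example with `storm_cmd distributed | sh`.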

You can also use Flux to do the same:

storm jar target/stormcrawler-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux --local crawler.flux
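The Flux command above only covers local mode. The sketch below assumes that Flux's `--remote` option, which this README does not mention, is how the same definition would be submitted to a cluster. The function name `flux_cmd` is hypothetical; the jar path and `crawler.flux` are the ones used above.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the Flux invocation for a mode.
# Assumption: --remote is Flux's counterpart to --local for cluster submission.
flux_cmd() {
  mode="${1:-local}"   # "local" or "remote"
  base="storm jar target/stormcrawler-1.0-SNAPSHOT.jar org.apache.storm.flux.Flux"
  case "$mode" in
    # The mode name doubles as the Flux option name (--local or --remote).
    local|remote) printf '%s\n' "$base --$mode crawler.flux" ;;
    *)            printf 'unknown mode: %s\n' "$mode" >&2; return 1 ;;
  esac
}
```

`flux_cmd remote` prints the cluster variant, which can be inspected and then piped to `sh` once it looks right.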

Languages

Java 61.5%, FLUX 38.5%