A multi-threaded recursive crawler implemented with Apache Camel. Crawling ability is basic: only text/html content is downloaded and stored locally.
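As an illustration of the text/html restriction, here is a minimal sketch of how a response's Content-Type header could be checked before a page is saved. The class name ContentTypeFilter and the method isHtml are hypothetical and not taken from this project, so the crawler's actual check may work differently.

```java
import java.util.Locale;

// Hypothetical helper, not part of this project's source code.
public class ContentTypeFilter {

    // Returns true when the Content-Type header value denotes text/html,
    // ignoring case and any parameters such as "; charset=UTF-8".
    public static boolean isHtml(String contentType) {
        if (contentType == null) {
            return false; // no header: treat the response as not downloadable
        }
        int semicolon = contentType.indexOf(';');
        String mediaType = semicolon >= 0
                ? contentType.substring(0, semicolon)
                : contentType;
        return mediaType.trim().toLowerCase(Locale.ROOT).equals("text/html");
    }

    public static void main(String[] args) {
        System.out.println(isHtml("text/html; charset=UTF-8")); // prints "true"
        System.out.println(isHtml("application/pdf"));          // prints "false"
    }
}
```

Comparing only the media type, rather than the whole header value, keeps pages that declare a charset from being skipped.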
Tweak the configuration in application.properties and use Maven for building.
Package the fat JAR from the project directory:
mvn package
Then start the crawler with:
java -jar target/scraper-0.0.1.jar http://mywebsite.com
Pages will be saved to the output folder in the local directory. To stop crawling, press CTRL+C.