- Ruby 2.7.X
- Mechanize (gem) 2.8.X
  - If the Mechanize gem is not installed, install it by running `gem install mechanize`
To run the web crawler script:
- Clone this repo
- cd into the repo folder
- Run `bundle install`
- Create a urls.txt file in the lib folder (the crawler looks for this file name)
  - From the terminal, type `touch lib/urls.txt`
- Open the urls.txt file and type the URL you would like to crawl
  - Example: https://github.com
  - If you are just testing, avoid a large site such as www.google.com, because parsing all the URLs found will take a long time.
- Save the file
- Run the script by typing `ruby lib/crawler.rb`
- Check the URLs parsed in the lib/urls.txt file
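
For orientation, a single crawl step could look roughly like the sketch below. It is not the code in lib/crawler.rb: the method names `absolute_links` and `links_on` are hypothetical, and the filtering rules (keep only http/https links, drop duplicates) are assumptions made for illustration.

```ruby
require "uri"

# Hypothetical helper (not taken from lib/crawler.rb): turn the href values
# found on a page into absolute, de-duplicated http(s) URLs.
def absolute_links(base_url, hrefs)
  hrefs.filter_map do |href|
    next if href.nil? || href.strip.empty?

    begin
      uri = URI.join(base_url, href.strip)
    rescue URI::Error
      next # skip hrefs that cannot be parsed or resolved
    end

    # URI::HTTPS is a subclass of URI::HTTP, so this keeps both schemes
    # and drops things like mailto: links.
    uri.to_s if uri.is_a?(URI::HTTP)
  end.uniq
end

# Hypothetical crawl step: fetch one page with Mechanize and return its links.
def links_on(url)
  require "mechanize" # loaded lazily so absolute_links works without the gem
  page = Mechanize.new.get(url)
  # Non-HTML responses are not Mechanize::Page objects and have no links.
  return [] unless page.respond_to?(:links)

  absolute_links(page.uri.to_s, page.links.map(&:href))
end
```

Calling `links_on("https://github.com")` needs network access and the Mechanize gem. A page with many links returns a long list, which is why a large site takes a while to work through.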
- Refactor the script
  - Break down the crawl method.
- Make the script create a file to store URLs if the file does not exist
- Ask the user to input the site to be crawled instead of having to edit the urls.txt file
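
The last two items could be approached along these lines. This is only a sketch using the Ruby standard library: the names `URLS_FILE`, `ensure_urls_file`, and `prompt_for_site` are hypothetical, and overwriting the file with the entered site is an assumption about the desired behavior.

```ruby
require "fileutils"

# File name the crawler looks for, per the steps above (constant name is hypothetical).
URLS_FILE = "lib/urls.txt"

# Create the URL file (and its parent folder) when it is missing.
# An existing file is left untouched.
def ensure_urls_file(path = URLS_FILE)
  FileUtils.mkdir_p(File.dirname(path))
  FileUtils.touch(path) unless File.exist?(path)
  path
end

# Ask for the site to crawl and store it as the only line of the URL file.
# Returns the entered site, or nil when nothing was typed.
def prompt_for_site(path = URLS_FILE, input: $stdin, output: $stdout)
  output.print "Site to crawl (e.g. https://github.com): "
  site = input.gets.to_s.strip
  return nil if site.empty?

  ensure_urls_file(path)
  File.write(path, "#{site}\n") # assumption: replace any previous content
  site
end
```

The input and output streams are passed as keyword arguments so the prompt can be exercised without a terminal. The crawler would call `prompt_for_site` once at start-up and stop if it returns nil.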