JCrawl
JCrawl - Java Websites Crawler
JCrawl is a basic web crawler implemented in Java. Starting from a given URL, it scrapes web pages and extracts the links they contain. Web crawling is the process of navigating web pages and extracting information from them, and is commonly used by search engines and web scrapers.
Features
- Web crawling from a starting URL.
- Specify the number of links to scrape using a breakpoint.
- Extract links from web pages.
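
The features above can be illustrated with a minimal sketch. This is not JCrawl's actual source: the class name `CrawlSketch`, the methods `extractLinks` and `crawl`, and the `fetcher` parameter are all hypothetical names chosen for illustration. The sketch assumes that the breakpoint is a cap on the number of distinct URLs collected (the starting URL included), and that links are found with a simple regular expression that only matches absolute `http`/`https` URLs in double-quoted `href` attributes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names and behavior are assumptions, not JCrawl's real code.
class CrawlSketch {
    // Matches absolute http(s) URLs inside double-quoted href attributes only.
    // Relative links and single-quoted attributes are deliberately ignored.
    private static final Pattern HREF =
        Pattern.compile("href=\"(https?://[^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Returns every matching link in the order it appears in the HTML.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    // Breadth-first crawl from startUrl. Stops once `breakpoint` distinct URLs
    // (counting startUrl itself) have been collected. `fetcher` maps a URL to
    // its HTML, or returns null when the page cannot be loaded.
    static Set<String> crawl(String startUrl, int breakpoint,
                             Function<String, String> fetcher) {
        Set<String> seen = new LinkedHashSet<>(); // keeps discovery order
        Queue<String> queue = new ArrayDeque<>();
        if (breakpoint <= 0) {
            return seen;
        }
        seen.add(startUrl);
        queue.add(startUrl);
        while (!queue.isEmpty() && seen.size() < breakpoint) {
            String html = fetcher.apply(queue.poll());
            if (html == null) {
                continue; // skip pages that could not be fetched
            }
            for (String link : extractLinks(html)) {
                if (seen.size() >= breakpoint) {
                    break;
                }
                if (seen.add(link)) { // true only for URLs not seen before
                    queue.add(link);
                }
            }
        }
        return seen;
    }
}
```

Passing the page loader in as a function keeps the sketch runnable without network access. A real crawler would supply a fetcher that performs an HTTP request, for example with `java.net.http.HttpClient` on Java 11 or later, and would likely use an HTML parser instead of a regular expression.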
Prerequisites
- Java Development Kit (JDK) installed on your system.
Usage
- Clone or download this repository to your local machine.
- Compile the JCrawl.java file using javac:
  javac JCrawl.java
- Run the program:
  java JCrawl