ManuelSch / Link-Scraper

A simple Java web scraper that collects all links on a given website that point to the same domain

Java Link Scraper

A simple web scraper implemented in Java 8.

1. Fetches the given entry URL
2. Extracts all <a href=""> tags from the HTML page source that point to the same domain as the entry URL
3. Repeats steps 1. and 2. with the newly found links until the whole website has been scraped
4. Outputs all found URLs together with their title attributes
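
The extraction step (step 2) could look roughly like the sketch below. It is an illustration only and is not taken from the project's source: the class name SameDomainLinkExtractor and the method extract are hypothetical, and the regular expressions are a simplification that assumes double-quoted attribute values. A real HTML parser would be more robust. The sketch resolves relative links against the page URL, keeps only links whose host matches the page's host, and records each link's title attribute (empty if absent).

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original project.
class SameDomainLinkExtractor {

    // Opening <a ...> tags; group 1 holds the raw attribute text.
    private static final Pattern ANCHOR =
            Pattern.compile("<a\\b([^>]*)>", Pattern.CASE_INSENSITIVE);
    // Assumption: attribute values are wrapped in double quotes.
    private static final Pattern HREF =
            Pattern.compile("\\bhref\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
    private static final Pattern TITLE =
            Pattern.compile("\\btitle\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Returns same-host links found in html, mapped to their title attribute. */
    static Map<String, String> extract(String pageUrl, String html) {
        URI base = URI.create(pageUrl);
        Map<String, String> links = new LinkedHashMap<>();
        Matcher anchor = ANCHOR.matcher(html);
        while (anchor.find()) {
            String attrs = anchor.group(1);
            Matcher href = HREF.matcher(attrs);
            if (!href.find()) {
                continue; // anchor without an href attribute
            }
            URI target;
            try {
                // Turns relative links such as "/about" into absolute URLs.
                target = base.resolve(href.group(1).trim());
            } catch (IllegalArgumentException e) {
                continue; // skip hrefs that are not valid URIs
            }
            String host = target.getHost();
            if (host == null || !host.equalsIgnoreCase(base.getHost())) {
                continue; // different domain, or no host at all (e.g. mailto:)
            }
            Matcher title = TITLE.matcher(attrs);
            // Keep the first title seen for a URL that appears more than once.
            links.putIfAbsent(target.toString(), title.find() ? title.group(1) : "");
        }
        return links;
    }
}
```

Calling SameDomainLinkExtractor.extract with the page's URL and its HTML source returns a map from each same-domain URL to its title, in the order the links were found on the page. Comparing hosts exactly is one possible reading of "same domain"; treating subdomains as the same site would need a looser comparison.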

Uses a fixed-size thread pool to execute the scrape requests concurrently.
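
A minimal sketch of how a fixed-size thread pool can drive such a crawl is shown below. It is not the project's actual implementation: the class ConcurrentCrawler, its constructor parameters, and the linkFinder function (which stands in for "fetch a URL and return its same-domain links") are assumptions made for illustration. Each newly discovered URL is submitted to the pool exactly once, and a counter of unfinished tasks signals when the whole site has been visited.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical crawler, not part of the original project.
class ConcurrentCrawler {

    private final ExecutorService pool;
    // Assumed stand-in for "fetch the page and extract its same-domain links".
    private final Function<String, Set<String>> linkFinder;
    // Thread-safe set; add() returns false for URLs that were already seen.
    private final Set<String> visited = ConcurrentHashMap.newKeySet();
    // Number of submitted tasks that have not finished yet.
    private final AtomicInteger pending = new AtomicInteger();
    private final CountDownLatch done = new CountDownLatch(1);

    ConcurrentCrawler(int threads, Function<String, Set<String>> linkFinder) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.linkFinder = linkFinder;
    }

    /** Crawls from entryUrl and returns every URL that was visited. */
    Set<String> crawl(String entryUrl) {
        submit(entryUrl);
        try {
            done.await(); // released when the last pending task finishes
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // give up waiting, keep the flag
        } finally {
            pool.shutdown();
        }
        return visited;
    }

    private void submit(String url) {
        if (!visited.add(url)) {
            return; // already scheduled or scraped
        }
        pending.incrementAndGet();
        pool.execute(() -> {
            try {
                // Children are counted before this task is counted as finished,
                // so pending only reaches zero when no work is left.
                for (String link : linkFinder.apply(url)) {
                    submit(link);
                }
            } finally {
                if (pending.decrementAndGet() == 0) {
                    done.countDown();
                }
            }
        });
    }
}
```

The pool size bounds how many pages are fetched at the same time, while the visited set prevents pages that link to each other from being scraped repeatedly. A crawler instance in this sketch is meant for a single crawl, since the pool is shut down once the crawl completes.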

