SmartWebCrawler

A simple web crawler, implemented in both Java and Python.

Easy Web Crawler With Java

A simple web crawler written in Java.

Language: Java
Library: Jsoup
Latest version: v2.0

Key Point:

Language: Python

Libraries: urllib (standard library), beautifulsoup4 (third-party)

If you want to collect every a tag on a page, try this beautifulsoup4-based crawler.
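As a rough illustration of the idea (not the project's actual code), the link-collection step could look like the sketch below. The function name extract_links and the sample HTML are assumptions made for this example; only BeautifulSoup, find_all, and urljoin are real library calls. It parses an HTML string and returns the target of every a tag that has an href, resolved against the page URL.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return the absolute URL of every <a href=...> in html, in document order.

    extract_links is a hypothetical helper for this sketch, not a function
    taken from SmartWebCrawler.py.
    """
    # "html.parser" is Python's built-in parser, so no extra install is needed.
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips <a> tags that have no href attribute at all.
    for a in soup.find_all("a", href=True):
        # Resolve relative links such as "/about" against the page URL.
        links.append(urljoin(base_url, a["href"]))
    return links


# Made-up sample page: two real links and one anchor without an href.
sample = (
    '<p><a href="/about">About</a> '
    '<a name="top">no href</a> '
    '<a href="https://example.com/x">X</a></p>'
)
print(extract_links(sample, "http://www.runoob.com/"))
# → ['http://www.runoob.com/about', 'https://example.com/x']
```

Resolving each href with urljoin means relative links are recorded as full URLs, which is usually what you want when the list is saved for later crawling.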

https://www.python.org/downloads/

https://www.crummy.com/software/BeautifulSoup/bs4/download/4.6/

python -m pip install --upgrade pip

pip install bs4

Run without arguments, the script requests the default URL written in the code and records every URL that an a href tag on that page points to.

python SmartWebCrawler.py

Alternatively, pass a URL on the command line, and the script records every URL that an a href tag on that page points to.

python SmartWebCrawler.py http://www.runoob.com/
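The two ways of running the script (built-in URL versus a URL given as an argument) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the contents of SmartWebCrawler.py: DEFAULT_URL, choose_url, and fetch_html are hypothetical names, and the default URL value is a placeholder because the README does not say which URL the script ships with.

```python
import sys
from urllib.request import Request, urlopen

# Placeholder default; the URL actually hard-coded in the script is not shown here.
DEFAULT_URL = "https://www.python.org/"


def choose_url(argv, default=DEFAULT_URL):
    """Return the URL passed as the first command-line argument, else the default."""
    # argv[0] is the script name, so a user-supplied URL would be argv[1].
    if len(argv) > 1 and argv[1].strip():
        return argv[1].strip()
    return default


def fetch_html(url, timeout=10):
    """Download a page with urllib and return its body as text."""
    # Some sites reject urllib's default User-Agent, so send a browser-like one.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Typical use inside a script:
#     url = choose_url(sys.argv)
#     html = fetch_html(url)
```

Keeping the URL selection separate from the download makes the argument handling easy to check without touching the network.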

For more detail, please see the article below:


Languages: Java 81.3%, Python 18.7%