Mr0Wido / urlcrawler.py

urlcrawler.py is a Python script that performs a web crawl for a specific domain or a list of domains. It finds all URLs under those domains.

urlcrawler.py

urlcrawler.py is a Python script that crawls a single domain, or every domain in a list, and collects the URLs it finds under each one.
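
To make "crawl a domain and collect its URLs" concrete, here is a minimal breadth-first sketch. It is not the script's actual code: `crawl`, `get_links`, and `max_pages` are hypothetical names, and `get_links` stands in for whatever fetches a page and returns its raw href values. The only assumption about behavior is that the crawl stays on the host of the starting URL.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse


def crawl(start_url, get_links, max_pages=100):
    """Breadth-first crawl that stays on the host of start_url.

    get_links(url) is a hypothetical callable returning the raw href
    values found on that page. max_pages caps how many pages are visited.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = 0
    while queue and visited < max_pages:
        page = queue.popleft()
        visited += 1
        for href in get_links(page):
            # Resolve relative links against the current page, drop #fragments.
            url, _ = urldefrag(urljoin(page, href))
            if urlparse(url).netloc == host and url not in seen:
                seen.add(url)
                queue.append(url)
    return sorted(seen)


# A tiny in-memory "site" stands in for real HTTP responses.
site = {
    "https://test.com/": ["/about", "contact.html", "https://other.org/"],
    "https://test.com/about": ["/", "/team#staff"],
}
for url in crawl("https://test.com/", lambda u: site.get(u, [])):
    print(url)
```

Passing the link source in as a function keeps the traversal logic separate from networking, so the same loop can be exercised offline as shown here.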

Installation

git clone https://github.com/Mr0Wido/urlcrawler.py.git
cd urlcrawler.py
pip install requests beautifulsoup4
python3 urlcrawler.py

Usage

python3 urlcrawler.py -d test.com
python3 urlcrawler.py -d test.com -o urls.txt
python3 urlcrawler.py -l domains.txt

Options

Flag  Long form  Description
-h    --help     Show this help message and exit.
-d    --domain   The domain to crawl. Example: https://test.com
-l    --list     File containing a list of domains to crawl.
-o    --output   The output file where the found URLs will be saved.
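
For readers who want to see how flags like these are typically wired up, the following is a sketch using Python's standard `argparse` module. It is an illustration, not the script's real parser: `build_parser` is a hypothetical helper, and treating `-d` and `-l` as mutually exclusive is an assumption that the table above does not state.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        prog="urlcrawler.py",
        description="Crawl a domain or a list of domains and collect URLs.",
    )
    # Assumption: exactly one target source is given, a domain or a list file.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("-d", "--domain", help="The domain to crawl.")
    target.add_argument("-l", "--list", help="File containing a list of domains to crawl.")
    parser.add_argument("-o", "--output", help="File where the found URLs will be saved.")
    return parser  # argparse adds -h/--help automatically


args = build_parser().parse_args(["-d", "https://test.com", "-o", "urls.txt"])
print(args.domain, args.output)  # https://test.com urls.txt
```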

Requirements

requests
BeautifulSoup4

Notes

This script tries to find all URLs under a specific domain. However, URLs that are generated by JavaScript or other dynamic content may not be found. The script also sends a large number of requests, which can put a heavy load on the target server, so use it only on your own sites or on sites where you have explicit permission.
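
The JavaScript limitation can be illustrated with a short sketch. It uses the standard library's `html.parser` instead of BeautifulSoup4 so that it runs without extra packages, and `LinkCollector` and `extract_links` are hypothetical names, not part of the script. The point is only that a parser working on the HTML text sees links written in the markup, not links a browser would create by running scripts.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags present in the HTML text."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


page = """
<a href="/about">About</a>
<script>
  var a = document.createElement("a");
  a.href = "/built-at-runtime";
  document.body.appendChild(a);
</script>
"""
# The link created inside <script> never appears as an <a> tag in the markup.
print(extract_links(page))  # ['/about']
```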

