hassanaftab93 / Web-Crawler

A pybot to crawl a given website and extract all URLs from it.


Web Crawler


How it Works

The basic idea of this Python bot is to crawl a given website and extract every URL found on it.
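The idea can be illustrated with a minimal sketch that uses only the Python standard library. This is not the repository's actual code: the names `LinkParser`, `extract_links`, `fetch`, and `crawl`, the `max_pages` limit, and the choice to follow only links on the starting host are all assumptions made for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute http(s) URLs found in html, in order, without duplicates."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(absolute).scheme in ("http", "https") and absolute not in links:
            links.append(absolute)
    return links


def fetch(url):
    """Download a page and return its text (hypothetical helper)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=50):
    """Breadth-first crawl of the start URL's host; returns every URL seen."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        visited += 1
        try:
            html = fetch_page(url)
        except OSError:
            continue  # skip pages that fail to download
        for link in extract_links(html, url):
            if link not in seen:
                seen.add(link)
                if urlparse(link).netloc == host:
                    queue.append(link)  # only follow links on the same host
    return sorted(seen)
```

Calling `crawl("https://example.com/")` would return a sorted list of the URLs discovered. External links are recorded but not followed, which keeps the crawl bounded to one site; a real crawler would likely also respect `robots.txt` and rate limits.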

Steps

Clone this Repo

git clone https://github.com/hassanaftab93/Web-Crawler.git

Create a Virtual Environment for this Project

python -m venv venv

Activate Virtual Environment for this Project

Windows (Git Bash):

source venv/Scripts/activate

Linux/macOS:

source ./venv/bin/activate

Install the Required Libraries

pip install -r requirements.txt

Run the File

sh crawl.sh

Contributing

Contributions are always welcome!

Authors



License: Other

Languages

Python 79.4%, Shell 18.9%, Procfile 1.7%