gabriellydeandrade / webCrawler

Python Web Crawler: Using Selenium

About

The main objective of this project is to create a crawler that extracts the title, name, and URL of every product on this website: http://www.epocacosmeticos.com.br/.
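
As a rough illustration of the extraction step, the sketch below pulls the three fields out of a single product page with BeautifulSoup. It is not taken from crawler.py: the function name `extract_product`, the `.product-name` selector, and the sample HTML are all assumptions made for the example, and the real site's markup would need to be inspected to find the right selector.

```python
from bs4 import BeautifulSoup

# Hypothetical selector: this README does not describe the site's markup,
# so adjust it after inspecting a real product page.
NAME_SELECTOR = ".product-name"


def extract_product(html, url):
    """Return the page title, product name, and URL for one product page."""
    soup = BeautifulSoup(html, "html.parser")
    title_tag = soup.title
    name_tag = soup.select_one(NAME_SELECTOR)
    return {
        # Fall back to an empty string when a field is missing from the page.
        "title": title_tag.get_text(strip=True) if title_tag else "",
        "name": name_tag.get_text(strip=True) if name_tag else "",
        "url": url,
    }


# Made-up HTML standing in for a page source fetched by the crawler.
sample_html = """
<html><head><title>Example Perfume 50ml | Example Store</title></head>
<body><h1 class="product-name">Example Perfume 50ml</h1></body></html>
"""

product = extract_product(sample_html, "http://www.example.com/example-perfume/p")
print(product)
```

In a Selenium-based crawler, the `html` argument would typically be the driver's `page_source` for each product page visited.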

Requirements

Mozilla Firefox
geckodriver (the WebDriver for Firefox)
Python 3
Selenium
BeautifulSoup4
Requests

Installation

1. Clone or download this repository

You can clone it with git:

git clone https://github.com/Gabrielly-Andrade/webCrawler.git

or download the ZIP package.

2. Install the Firefox browser and geckodriver

3. Install Python 3

4. Install the packages

You can install the packages in this step using pip.

  • Pip

    4.1 Selenium
    pip install selenium
    
    4.2 BeautifulSoup4
    pip install beautifulsoup4
    
    4.3 Requests
    pip install requests
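
Once the steps above are done, a short script along these lines can confirm that everything is in place before running the crawler. It is only a sketch and not part of this repository: the helper name `missing_requirements` is made up, and it assumes the `firefox` and `geckodriver` executables are reachable through PATH, which may not hold on every system (on macOS, for instance, Firefox is usually installed as an application bundle).

```python
import importlib.util
import shutil

# Import names of the pip packages listed above (beautifulsoup4 imports as bs4).
REQUIRED_MODULES = ("selenium", "bs4", "requests")
# Executables assumed to be on PATH; adjust for your platform.
REQUIRED_EXECUTABLES = ("firefox", "geckodriver")


def missing_requirements(modules=REQUIRED_MODULES, executables=REQUIRED_EXECUTABLES):
    """Return the names of modules and executables that cannot be found."""
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    missing += [name for name in executables if shutil.which(name) is None]
    return missing


missing = missing_requirements()
if missing:
    print("Missing requirements:", ", ".join(missing))
else:
    print("All requirements found.")
```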
    

Running

After installing everything, open a terminal, use cd to move into the src folder of the repository, and run

python crawler.py
