Esai-Keshav / web-scraping-task

Web Scraping Task Setup

Database setup:

Install MongoDB from the official website:

https://www.mongodb.com/try/download/community

Use MongoDB Compass to view the scraped data. The data will be stored in the collections table_1 and table_2 under the database db:

db
├── table_1
└── table_2
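
For readers who want to see how scraped rows could end up in this layout, here is a minimal sketch. It assumes the pymongo driver and a local MongoDB at the default URI mongodb://localhost:27017; neither is confirmed by this README. The helper names (to_documents, save, get_collection) and the extra "source" and "scraped_at" fields are hypothetical and may not match what the repo's scripts do.

```python
from datetime import datetime, timezone


def to_documents(rows, source):
    """Turn scraped rows (plain dicts) into MongoDB-ready documents.

    The "source" and "scraped_at" fields are illustrative assumptions.
    """
    scraped_at = datetime.now(timezone.utc)
    return [{**row, "source": source, "scraped_at": scraped_at} for row in rows]


def save(collection, rows, source):
    """Insert rows into a collection and return how many were inserted."""
    docs = to_documents(rows, source)
    if not docs:
        return 0  # pymongo's insert_many rejects an empty list
    result = collection.insert_many(docs)
    return len(result.inserted_ids)


def get_collection(name, uri="mongodb://localhost:27017"):
    """Return collection `name` from the database called "db".

    Assumes a local MongoDB server; pymongo is imported here so the
    helpers above can be used without it installed.
    """
    from pymongo import MongoClient

    return MongoClient(uri)["db"][name]
```

With MongoDB running, a call such as save(get_collection("table_1"), rows, "link-1") would write the rows to table_1, where they can then be inspected in MongoDB Compass.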

Libraries Used

  • joblib : For parallel processing

  • Beautiful Soup : For extracting data from HTML
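
As a rough illustration of how these two libraries can work together, the sketch below parses several pages in parallel. The HTML snippets, the extract and extract_all names, and the choice of a thread-based backend are assumptions made for the example, not code taken from this repo.

```python
from bs4 import BeautifulSoup
from joblib import Parallel, delayed

# Hypothetical HTML snippets standing in for downloaded pages.
PAGES = [
    "<html><head><title>Page A</title></head>"
    "<body><a href='/a1'>one</a></body></html>",
    "<html><head><title>Page B</title></head>"
    "<body><a href='/b1'>x</a><a href='/b2'>y</a></body></html>",
]


def extract(html):
    """Pull the page title and all link targets out of one HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.get_text(strip=True)
    links = [a["href"] for a in soup.find_all("a", href=True)]
    return {"title": title, "links": links}


def extract_all(pages, n_jobs=2):
    """Run extract() over every page using joblib workers.

    Results come back in the same order as the input pages.
    """
    return Parallel(n_jobs=n_jobs, prefer="threads")(
        delayed(extract)(page) for page in pages
    )


if __name__ == "__main__":
    for record in extract_all(PAGES):
        print(record)
```

Threads are requested here because scraping is usually dominated by waiting on the network; for CPU-heavy parsing, joblib's default process-based backend may be a better fit.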

Step 1:

Clone this repo with the command:

git clone https://github.com/Esai-Keshav/web-scraping-task

Step 2:

Install the required Python libraries with this command:

pip install -r requirements.txt

Step 3:

Run the command for each link to scrape its data:

Link 1:

python scrap_ajax.py

Link 2:

python scrap_forms.py

Link 3:

python scrap_advanced.py
python scrap_adv_2.py
