kb22 / Web-Scraping-using-Python

This project scrapes Wikipedia articles using BeautifulSoup to create a dataset and then analyzes the collected data.

Home Page: https://towardsdatascience.com/dataset-creation-and-cleaning-web-scraping-using-python-part-1-33afbf360b6b

Repository from GitHub: https://github.com/kb22/Web-Scraping-using-Python

Web-Scraping-using-Python

A Jupyter notebook to scrape Wikipedia webpages using Python to create a dataset.

The complete project is detailed as a two-part series:

  1. Part 1: Describes how web scraping can be used to fetch data from a website.
  2. Part 2: Describes how collected data can be cleaned before actual use.
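
The two parts above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample HTML, the function names `scrape_article` and `clean_text`, and the choice of cleaning steps (removing citation markers, collapsing whitespace) are all assumptions made for the example. It parses an inline HTML string so that it runs without network access; a real run would first download the page.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded article page. A real run would
# fetch the HTML over HTTP first (for example with requests.get(url).text).
SAMPLE_HTML = """
<html><body>
  <h1 id="firstHeading">Web scraping</h1>
  <div id="mw-content-text">
    <p>Web scraping is data scraping used for extracting data.<sup>[1]</sup></p>
    <p>   It can be done   manually or by a bot.<sup>[2]</sup> </p>
  </div>
</body></html>
"""


def scrape_article(html):
    """Part 1 (assumed shape): pull the title and raw paragraph text from a page."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.find("h1").get_text(strip=True)
    paragraphs = [p.get_text() for p in soup.find_all("p")]
    return {"title": title, "paragraphs": paragraphs}


def clean_text(text):
    """Part 2 (assumed steps): drop markers such as [1] and collapse whitespace."""
    text = re.sub(r"\[\d+\]", "", text)
    return re.sub(r"\s+", " ", text).strip()


record = scrape_article(SAMPLE_HTML)
record["paragraphs"] = [clean_text(p) for p in record["paragraphs"]]
print(record)
```

Keeping the scraping and cleaning steps in separate functions mirrors the two-part split: the raw text can be stored once and cleaned again later if the cleaning rules change.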

NOTE: This project is for understanding how web scraping works on actual websites. Before scraping any website, obtain the proper permissions and follow its terms and conditions.
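
One small, automatable part of this is checking a site's robots.txt file before requesting a page. The sketch below uses Python's standard `urllib.robotparser` module; the robots.txt content, the `example-bot` user agent, the `example.org` URLs, and the helper name `is_allowed` are hypothetical. A robots.txt check is only one signal and does not replace reading the site's terms or asking for permission.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. A real check would load the site's own
# file with parser.set_url(".../robots.txt") followed by parser.read().
SAMPLE_ROBOTS = """
User-agent: *
Disallow: /private/
""".splitlines()


def is_allowed(robots_lines, user_agent, url):
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


print(is_allowed(SAMPLE_ROBOTS, "example-bot", "https://example.org/wiki/Page"))
print(is_allowed(SAMPLE_ROBOTS, "example-bot", "https://example.org/private/data"))
```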

License: MIT License


Languages

Language: Jupyter Notebook 100.0%