sauramandal / web-scraping

Web Scraping & Crawling - for beginners

Web Scraping

Web scraping consists of gathering data available on websites. This can be done manually by a human user or automatically by a bot. A bot can gather data much faster than a human, which is why this repository focuses on bots. With such a bot, it is technically possible to collect all the data of a website in a matter of minutes.

Prerequisites

  • python 2.7+
  • requests
  • beautifulsoup4
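
As a rough illustration of how the two libraries above fit together, the sketch below downloads a page with requests and pulls the links out of it with beautifulsoup4. The function names `fetch_html` and `extract_links` and the `SAMPLE_HTML` snippet are made up for this example and do not come from the notebooks in this repository. The parsing step runs on an inlined HTML string so it can be tried without network access.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical sample page, inlined so the parsing step runs offline.
SAMPLE_HTML = """
<html><body>
  <h1>Books</h1>
  <a href="/book/1">First book</a>
  <a href="/book/2">Second book</a>
  <a>No link here</a>
</body></html>
"""


def fetch_html(url, timeout=10):
    """Download a page and return its HTML as text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def extract_links(html):
    """Return (link text, href) pairs for every <a> tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


# Parse the inlined sample; anchors without an href are skipped.
for text, href in extract_links(SAMPLE_HTML):
    print(text, href)
```

To scrape a live page instead, pass the result of `fetch_html("https://example.com")` (a placeholder URL) to `extract_links`.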

Websites used

References

  • Web Scraping with Python by Ryan Mitchell (PDF)

Copyright

  • Copyright infringement: In most jurisdictions web scraping is legal, but using copyrighted data is subject to certain restrictions.
  • Violation of the Computer Fraud and Abuse Act (CFAA): This law, enacted to deter computer hacking, prohibits fetching data by gaining unauthorized access to a page.
  • Trespass to chattels: Here the chattel is the website's server. A claim can arise if the scraping harms the server in any way, for example if the server slows down or stops because of the scraping.
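
The server-load concern above is one a scraper can reduce in practice. The sketch below shows one possible approach, assuming Python 3: skip URLs that the site's robots.txt disallows, and pause between requests. The helper names, the `demo-bot` user agent, the example.com URLs and the inlined `ROBOTS_TXT` are all hypothetical, and following robots.txt is a courtesy convention, not a legal safeguard.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Build a robots.txt parser from the file's text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, user_agent, urls):
    """Keep only the URLs that robots.txt lets this user agent fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def crawl_politely(urls, fetch, delay_seconds=1.0):
    """Call fetch(url) for each URL, sleeping between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_seconds)  # throttle to limit load on the server
        results.append(fetch(url))
    return results


candidates = [
    "https://example.com/index.html",
    "https://example.com/private/data.html",
]
parser = build_parser(ROBOTS_TXT)
print(allowed_urls(parser, "demo-bot", candidates))
```

In a real crawl, `fetch` would be a function that downloads the page, and the delay would be tuned to what the target site can comfortably handle.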
