rifatrakib / scraper-and-crawler-hub

A collection of scrapers and crawlers built with Python using numerous modules

Scraper and Crawler Hub

This repository includes several example web scrapers and crawlers, each illustrating a different technique:

  • Scraping Hacker News: This example uses requests and Beautiful Soup to scrape the Hacker News front page.

  • Using the Hacker News API: This example provides an alternative by showing how you can use APIs with requests.

  • Quotes to Scrape: This example uses requests and Beautiful Soup and introduces the dataset library as an easy means to store data.

  • Books to Scrape: This example uses requests and Beautiful Soup, as well as the dataset library, illustrating how you can run a scraper again without storing duplicate results.

  • Scraping GitHub Stars: This example uses requests and Beautiful Soup to scrape GitHub repositories and shows how you can perform a login using requests, reiterating our warnings regarding legal concerns.

  • Scraping Mortgage Rates: This example uses requests to scrape mortgage rates from a particularly tricky site.

  • Scraping and Visualizing IMDb Ratings: This example uses requests and Beautiful Soup to get a list of IMDb ratings for TV series episodes. We also introduce the matplotlib library to create plots in Python.

  • Scraping IATA Airline Information: This example uses requests and Beautiful Soup to scrape airline information from a site that employs a difficult web form. An alternative approach using Selenium is also provided. Scraped results are converted to a tabular format using the pandas library, also introduced in this example.
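
The Books to Scrape example above is described as re-running a scraper without storing duplicate results. The following is a minimal sketch of that idea, not the repository's actual code: it assumes every scraped record carries a stable unique key, and it uses the standard library's sqlite3 module in place of the dataset library so that it runs without extra dependencies. The table name, column names, and sample rows are all hypothetical.

```python
import sqlite3


def save_books(conn, books):
    """Store scraped rows, replacing any row that has the same book_id.

    `books` is a list of dicts with the hypothetical keys
    book_id, title, and price.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books ("
        "book_id TEXT PRIMARY KEY, title TEXT, price REAL)"
    )
    # The PRIMARY KEY on book_id makes INSERT OR REPLACE overwrite an
    # existing row instead of adding a second copy of it.
    conn.executemany(
        "INSERT OR REPLACE INTO books (book_id, title, price) "
        "VALUES (:book_id, :title, :price)",
        books,
    )
    conn.commit()


def count_books(conn):
    """Return the number of rows currently stored."""
    return conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database for the demo

    # Made-up rows standing in for the results of two scraper runs.
    first_run = [
        {"book_id": "a1", "title": "Example Book A", "price": 10.0},
        {"book_id": "a2", "title": "Example Book B", "price": 12.5},
    ]
    second_run = [
        {"book_id": "a2", "title": "Example Book B", "price": 11.0},
        {"book_id": "a3", "title": "Example Book C", "price": 8.0},
    ]

    save_books(conn, first_run)
    save_books(conn, second_run)
    print(count_books(conn))
```

Keying on a unique identifier means a second run updates rows it has already seen rather than appending them again. The dataset library offers a comparable upsert operation on its tables, which is one plausible way the original example achieves the same effect.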

Languages

Python: 100.0%