This repository includes several examples of web scrapers and crawlers, which serve the following purposes:
- Scraping Hacker News: This example uses requests and Beautiful Soup to scrape the Hacker News front page.
- Using the Hacker News API: This example provides an alternative by showing how you can use APIs with requests.
- Quotes to Scrape: This example uses requests and Beautiful Soup and introduces the dataset library as an easy means to store data.
- Books to Scrape: This example uses requests and Beautiful Soup, as well as the dataset library, illustrating how you can run a scraper again without storing duplicate results.
- Scraping GitHub Stars: This example uses requests and Beautiful Soup to scrape GitHub repositories and shows how you can perform a login using requests, reiterating our warnings regarding legal concerns.
- Scraping Mortgage Rates: This example uses requests to scrape mortgage rates from a particularly tricky site.
- Scraping and Visualizing IMDB Ratings: This example uses requests and Beautiful Soup to get a list of IMDB ratings for TV series episodes. We also introduce the matplotlib library to create plots in Python.
- Scraping IATA Airline Information: This example uses requests and Beautiful Soup to scrape airline information from a site that employs a difficult web form. An alternative approach using Selenium is also provided. Scraped results are converted to a tabular format using the pandas library, also introduced in this example.
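
The Books to Scrape example is described as re-running a scraper without storing duplicate results. The repository's own code uses the dataset library for this. As a rough illustration of the underlying idea only, the sketch below swaps in Python's standard-library sqlite3 module and keys each row on a unique identifier, so a second run overwrites rows instead of adding them again. The table name, column names, and sample rows are hypothetical and are not taken from the repository.

```python
import sqlite3


def save_books(conn, books):
    """Insert or overwrite scraped rows keyed on book_id (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books ("
        "book_id TEXT PRIMARY KEY, title TEXT, price REAL)"
    )
    # INSERT OR REPLACE overwrites an existing row that has the same primary key,
    # so saving the same book twice never produces a second row.
    conn.executemany(
        "INSERT OR REPLACE INTO books (book_id, title, price) "
        "VALUES (:book_id, :title, :price)",
        books,
    )
    conn.commit()


def count_books(conn):
    """Return the number of rows currently stored."""
    return conn.execute("SELECT COUNT(*) FROM books").fetchone()[0]


# Made-up sample data standing in for scraped results.
scraped = [
    {"book_id": "a1", "title": "Example Book One", "price": 10.0},
    {"book_id": "b2", "title": "Example Book Two", "price": 12.5},
]

conn = sqlite3.connect(":memory:")
save_books(conn, scraped)
save_books(conn, scraped)  # simulate running the scraper a second time
print(count_books(conn))   # still one row per book_id
```

The dataset library offers an upsert method on tables that serves a similar purpose. Whether the repository's example relies on it is not stated in this overview.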