HurleyWong / uol-covid19

Scrape the coronavirus case data from the University of Leeds website and generate the .csv files, make it as an Endpoint API Service

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

uol-covid19

This repository contains

  • A CSV file of aggregated coronavirus case data scraped from the University of Leeds website
  • Python code for the scraper
  • Python code to process and analyze the data

Scraper Usage Examples

To scrape case data and write to a new file in the scraper folder:

python daily-cases.py cases.csv

or in the root folder:

python scraper/daily-cases.py cases.csv

To scrape case data and update an existing CSV file in the scraper folder:

python daily-cases.py --update cases.csv

or in the root folder:

python scraper/daily-cases.py --update cases.csv

Note: if the newly-scraped data contains records whose dates match existing records, those existing records will be replaced by the new data.

API

In order to make this project to fetch data via the API, the Flask framework has been added to this project to enable this. You can start the service by running the extract.py file and then get the number of cases for the latest date and the latest week in the following links:

  • To get the cases of latest day, access http://127.0.0.1:5000/latest in your browser
  • To get the cases of latest week, access http://127.0.0.1:5000/days in your browser

Analysis Examples

To derive day-by-day active cases from a CSV file of daily case reports in the root folder:

python analysis/active-cases.py data/2021-cases.csv

This will display the active cases on screen as a table. To output the active cases to a file, do:

python analysis/active-cases.py -o data/2021-active.csv data/2021-cases.csv

Dependencies

To use the scripts, create and activate a Python 3 virtual environment, then do:

pip install -r requirements.txt

or choose to install dependecies in this way:

pip install requests
pip install beautifulsoup4
pip install rich

About

Scrape the coronavirus case data from the University of Leeds website and generate the .csv files, make it as an Endpoint API Service

License:Apache License 2.0


Languages

Language:Python 100.0%