mroswell / cdc-vaccination-history

A git scraper recording the CDC's Covid Data Tracker numbers on number of vaccinations per state.

Home Page: https://cdc-vaccination-history.datasette.io/

cdc-vaccination-history

Archives the JSON from https://covid.cdc.gov/covid-data-tracker/COVIDData/getAjaxData?id=vaccination_data every time it changes, checking three times an hour.
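The core of this pattern is "fetch, normalize, write only if different", so that git records a commit only when the data actually changed. Here is a minimal Python sketch of that idea. It is an illustration, not the code this repository runs: the output file name `vaccination_data.json` and the helper names `normalize`, `save_if_changed` and `scrape` are assumptions made for the example.

```python
import json
import urllib.request
from pathlib import Path

# Endpoint quoted in this README.
URL = "https://covid.cdc.gov/covid-data-tracker/COVIDData/getAjaxData?id=vaccination_data"


def normalize(raw: bytes) -> str:
    """Re-serialize JSON with sorted keys and indentation so git diffs stay readable."""
    return json.dumps(json.loads(raw), indent=2, sort_keys=True) + "\n"


def save_if_changed(path: Path, text: str) -> bool:
    """Write text to path only when it differs from the current contents."""
    if path.exists() and path.read_text(encoding="utf-8") == text:
        return False  # nothing new, so there is nothing to commit
    path.write_text(text, encoding="utf-8")
    return True


def scrape(path: Path = Path("vaccination_data.json")) -> bool:
    """Fetch the endpoint once; the file name is a hypothetical choice."""
    with urllib.request.urlopen(URL, timeout=30) as response:
        return save_if_changed(path, normalize(response.read()))
```

Calling `scrape()` on a schedule and committing whenever it returns `True` would give one commit per change to the upstream JSON.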

Watch Git scraping, the five-minute lightning talk, to see me live-code the creation of this repository.

This data in Datasette

The build_database.py script loops through the full commit history and uses it to build a SQLite database with a row for every daily report. It is mainly a demonstration of how Python code can be used to extract data from a git-scraped repository.
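As a rough sketch of the general technique (not a copy of build_database.py), the following walks every committed version of a file with `git log` and `git show`, then loads the records into SQLite. It makes several assumptions: the records are taken to be a list under a top-level key named `vaccination_data`, the column layout is invented for the example, and it stores one row per record per commit rather than deduplicating by day.

```python
import json
import sqlite3
import subprocess
from typing import Iterable, Iterator, Tuple

Version = Tuple[str, str, str]  # (commit hash, ISO commit date, file contents)


def iter_versions(repo: str, filename: str) -> Iterator[Version]:
    """Yield every committed version of filename (relative to repo root), oldest first."""
    log = subprocess.run(
        ["git", "-C", repo, "log", "--reverse", "--format=%H %cI", "--", filename],
        capture_output=True, text=True, check=True,
    ).stdout
    for line in log.splitlines():
        commit, committed_at = line.split(" ", 1)
        content = subprocess.run(
            ["git", "-C", repo, "show", f"{commit}:{filename}"],
            capture_output=True, text=True, check=True,
        ).stdout
        yield commit, committed_at, content


def load_reports(
    conn: sqlite3.Connection,
    versions: Iterable[Version],
    records_key: str = "vaccination_data",  # assumed top-level key
) -> int:
    """Insert one row per record per commit; return the number of rows added."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_reports ("
        "commit_hash TEXT, committed_at TEXT, record TEXT)"
    )
    count = 0
    for commit, committed_at, content in versions:
        for record in json.loads(content)[records_key]:
            conn.execute(
                "INSERT INTO daily_reports VALUES (?, ?, ?)",
                (commit, committed_at, json.dumps(record, sort_keys=True)),
            )
            count += 1
    conn.commit()
    return count
```

With a local clone, something like `load_reports(sqlite3.connect("cdc.db"), iter_versions(".", "vaccination_data.json"))` would populate the table; both file names there are hypothetical.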

That database is then deployed using Datasette. You can browse the data at https://cdc-vaccination-history.datasette.io/cdc/daily_reports

You can filter down to individual states like so:
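Datasette table pages accept `?column=value` query parameters for exact-match filters, and adding `.json` to the table path returns the same rows as JSON. The sketch below builds such a URL in Python. The column name `Location` and the value `CA` are assumptions for illustration, so check the table's actual columns before relying on them.

```python
from urllib.parse import urlencode

# Table URL quoted in this README.
TABLE_URL = "https://cdc-vaccination-history.datasette.io/cdc/daily_reports"


def state_filter_url(state: str, column: str = "Location", as_json: bool = False) -> str:
    """Build a Datasette URL filtered to one state.

    `column` is a hypothetical column name, not confirmed by this README.
    """
    base = TABLE_URL + (".json" if as_json else "")
    return f"{base}?{urlencode({column: state})}"


print(state_filter_url("CA"))
print(state_filter_url("CA", as_json=True))
```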

Take a look at the scrape.yml GitHub Actions workflow to see how the scraper runs, and how the data is then built into a database and published to Vercel using datasette publish.

Should you trust these numbers?

I honestly don't know. These numbers do not come from a documented API; I found the endpoint using the Firefox developer tools network pane. I don't know how the CDC are sourcing them, and I don't know if they themselves consider them to be accurate.

All I know is that these are the numbers they are displaying on their own site - so you should treat this repository as tracking "numbers that were displayed on the CDC's website" as opposed to assuming it represents the full truth on the ground.
