This repo analyses the coronavirus (COVID-19) outbreak: it extracts the latest data and generates reports. The repo is updated daily.
- Creates a time series dataset
- Creates a daily stats dataset
- Generates a number of visualizations
- You can also filter reports for a given country
- Generates an excel report including all of the above
- All results are saved to the output reports folder
- Check out the kanban boards to see work in progress
- You may have noticed that there are some discrepancies in the JHU data.
- These discrepancies include countries missing from some sheets, misspelled country names, and countries named inconsistently (for example, South Korea and Republic of Korea)
- I am doing my best to update the preprocessing code to fix these problems. Please be patient and I will release the newest version of covidify ASAP
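As an illustration of the naming problem described above, here is a minimal sketch of how variant country names could be mapped to one canonical name before aggregating. This is not covidify's actual preprocessing code: the `COUNTRY_ALIASES` table, the `normalize_country` helper and the sample rows are all hypothetical.

```python
# Hypothetical alias table: maps variant spellings to one canonical name.
# This is an illustrative assumption, not the mapping covidify uses.
COUNTRY_ALIASES = {
    "republic of korea": "South Korea",
    "korea, south": "South Korea",
    "south korea": "South Korea",
}


def normalize_country(name: str) -> str:
    """Return a canonical country name, or the cleaned input if unknown."""
    cleaned = " ".join(name.split())  # trim and collapse stray whitespace
    return COUNTRY_ALIASES.get(cleaned.lower(), cleaned)


# Made-up sample rows of (country, confirmed cases).
rows = [("Republic of Korea", 10), ("South Korea", 5), (" Ireland ", 2)]

totals = {}
for country, cases in rows:
    key = normalize_country(country)
    totals[key] = totals.get(key, 0) + cases

print(totals)  # {'South Korea': 15, 'Ireland': 2}
```

Without the normalization step, the two Korean rows would be counted as separate countries.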
pip install covidify
How to run:
$ covidify
Usage: covidify [OPTIONS] COMMAND [ARGS]...
☣ COVIDIFY ☣
- use the most up-to-date data to generate reports of confirmed cases,
fatalities and recoveries.
Options:
--help Show this message and exit.
Commands:
run
$ covidify run --help
Usage: covidify run [OPTIONS]
Options:
--output TEXT Folder to output data and reports [Default:
/Users/award40/Desktop/covidify-output/]
--source TEXT There are two datasources to choose from, John Hopkins
github repo or wikipedia -- options are git or wiki
respectively [Default: git]
--country TEXT Filter reports by a country [Default: Global cases]
--help Show this message and exit.
Example Commands:
# Will default to desktop folder
# for output and github for datasource
covidify run
# Will default to desktop folder for output
covidify run --source=wiki
covidify run --output=/Users/award40/Documents/projects-folder --source=git
# Filter reports by country
covidify run --country="South Korea"
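If you want to drive the commands above from a Python script, one possible approach is to build the argument list and hand it to `subprocess`. The sketch below only uses the `--output`, `--source` and `--country` options shown in the help text; the helper names `build_covidify_command` and `run_covidify` are hypothetical and not part of covidify.

```python
import subprocess


def build_covidify_command(output=None, source=None, country=None):
    """Build the argument list for `covidify run` from optional settings."""
    cmd = ["covidify", "run"]
    if output is not None:
        cmd.append(f"--output={output}")
    if source is not None:
        # The help text lists git and wiki as the two data sources.
        if source not in ("git", "wiki"):
            raise ValueError("source must be 'git' or 'wiki'")
        cmd.append(f"--source={source}")
    if country is not None:
        cmd.append(f"--country={country}")
    return cmd


def run_covidify(**options):
    """Run covidify with the given options (requires covidify on PATH)."""
    return subprocess.run(build_covidify_command(**options), check=True)


print(build_covidify_command(source="wiki", country="South Korea"))
# ['covidify', 'run', '--source=wiki', '--country=South Korea']
```

Passing the arguments as a list means a country name containing spaces needs no extra shell quoting. Calling `run_covidify(country="South Korea")` would then run the tool itself.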
These plots are updated daily to visualize three attributes:
confirmed cases
deaths
recoveries
This is a cumulative sum trendline for all the confirmed cases, deaths and recoveries.
This is a daily sum trendline for all the confirmed cases, deaths and recoveries.
This stacked bar chart shows, for each day, the number of people who are currently confirmed (red) and the number of people who were confirmed on that day (blue).
A count of new cases recorded on the given date; it does not take past confirmations into account.
A count of deaths due to the virus recorded on the given date; it does not take past deaths into account.
A count of new recoveries recorded on the given date; it does not take past recoveries into account.
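The per-day counts described above can be derived from running totals by differencing consecutive days. The sketch below assumes the input is a cumulative series and that everything in the first total counts as new on the first day; the function name and the sample numbers are hypothetical, and this is not necessarily how covidify computes them.

```python
def daily_from_cumulative(cumulative):
    """Convert running totals into per-day counts by differencing."""
    daily = []
    previous = 0  # assumption: nothing was recorded before the first day
    for total in cumulative:
        daily.append(total - previous)
        previous = total
    return daily


# Made-up cumulative confirmed cases for four consecutive days.
cumulative_confirmed = [2, 5, 5, 9]
print(daily_from_cumulative(cumulative_confirmed))  # [2, 3, 0, 4]
```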
A count for all the people who are currently infected for a given date (confirmed cases - (recoveries + deaths))
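The formula above, confirmed cases - (recoveries + deaths), is shown below as a small worked example. The function name and the sample figures are made up for illustration.

```python
def currently_infected(confirmed, recoveries, deaths):
    """Active cases: confirmed cases minus those recovered or deceased."""
    return confirmed - (recoveries + deaths)


# Made-up cumulative totals per day: (date, confirmed, recoveries, deaths).
days = [
    ("day 1", 100, 20, 5),
    ("day 2", 150, 40, 8),
]

for date, confirmed, recoveries, deaths in days:
    print(date, currently_infected(confirmed, recoveries, deaths))
# day 1 75
# day 2 102
```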
- The data comes from Novel Coronavirus (COVID-19) Cases, a live dataset provided by JHU CSSE.
- Data available here.
- All code written by me (Aaron Ward - https://www.linkedin.com/in/aaronjward/)
- A special thank you to the JHU CSSE team for maintaining the data