Contains the IPython notebook for the project completed as part of CSE544.
Analyze a COVID19 + X dataset, where the COVID19 dataset can be anything you want, such as COVID19 data from NY, from Spain, or from a given county in NY. The dataset must span at least one month.
- Remove any outliers using Tukey’s rule.
- Provide basic visualization of the COVID19 and X datasets to explain the general trends in data.
- Use your COVID19 dataset to predict COVID19 fatalities and #cases for the next week. Use the following four prediction techniques: (i) AR(3), (ii) AR(5), (iii) EWMA with alpha = 0.5, and (iv) EWMA with alpha = 0.8.
- Apply Wald’s test, the Z-test, and the t-test (assume all are applicable) to check whether the means of COVID19 deaths and #cases differ between the first week and the last week.
- Apply the K-S test and the permutation test.
- Report the Pearson correlation value for #deaths and your X dataset, and also for #cases and your X dataset over one month of data.
- Use your X dataset to check if COVID19 had an impact on the X data.
- Check if COVID19 data changed after some local event or rule was enforced, like lockdown or stay-at-home, etc.
- Use linear regression to find the impact of age, gender, underlying conditions, etc., on the severity of COVID19 symptoms or duration.
- Use the Chi-square independence test to check if COVID19 impacted your X dataset in some way.
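
The outlier-removal and prediction tasks above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual implementation: the function names (`tukey_filter`, `ewma_forecast`, `ar_forecast`) are hypothetical, NumPy is assumed to be available, Tukey's rule is assumed to use the usual 1.5 × IQR fences, the EWMA is assumed to be seeded with the first observation, and the AR(p) model is assumed to be fitted by ordinary least squares with an intercept.

```python
import numpy as np


def tukey_filter(x, k=1.5):
    """Keep only values inside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    mask = (x >= q1 - k * iqr) & (x <= q3 + k * iqr)
    return x[mask]


def ewma_forecast(x, alpha):
    """One-step-ahead EWMA: pred[t+1] = alpha * x[t] + (1 - alpha) * pred[t]."""
    x = np.asarray(x, dtype=float)
    pred = x[0]  # assumption: seed the recursion with the first observation
    for value in x:
        pred = alpha * value + (1 - alpha) * pred
    return pred


def ar_forecast(x, p, steps=7):
    """Fit AR(p) with an intercept by least squares, then forecast `steps` days."""
    x = np.asarray(x, dtype=float)
    # Row for day t is [1, x[t-1], x[t-2], ..., x[t-p]]; the target is x[t].
    rows = [np.concatenate(([1.0], x[t - p:t][::-1])) for t in range(p, len(x))]
    design = np.array(rows)
    target = x[p:]
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)

    # Forecast iteratively, feeding each prediction back in as a new lag.
    history = list(x)
    forecasts = []
    for _ in range(steps):
        lags = history[-p:][::-1]
        nxt = beta[0] + float(np.dot(beta[1:], lags))
        forecasts.append(nxt)
        history.append(nxt)
    return np.array(forecasts)


if __name__ == "__main__":
    # Made-up daily counts, for illustration only (not real COVID19 data).
    daily_cases = [10, 12, 11, 13, 12, 11, 500, 14, 13, 15, 16, 15, 17, 18]
    clean = tukey_filter(daily_cases)
    print("after Tukey's rule:", clean)
    print("EWMA(0.5) next day:", ewma_forecast(clean, 0.5))
    print("EWMA(0.8) next day:", ewma_forecast(clean, 0.8))
    print("AR(3) next week:", ar_forecast(clean, 3))
```

In this sketch, `ewma_forecast` returns a single next-day value; how that value is extended over a full week (for example, by updating with each new observation as it arrives) is a design choice left to the notebook.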