k-sys / covid-19

A collection of work related to COVID-19

Weekend reporting effect

Nectarineimp opened this issue

There is a structural issue in the data because weekend cases are often reported on Monday. You can see the structure in the data. You might get more accuracy if you take it into account.

commented

Do you have data or a notebook you can provide that shows this effect and perhaps suggest a way to correct for it?

I've noticed this too--the new case counts routinely appear to drop on Sundays. Your smoothing should already be limiting any weekend effect to some extent, but down-weighting weekend counts in the smooth might be reasonable here.
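For illustration, down-weighting weekend counts inside a centered smooth could look like the sketch below. It is not the repo's smoothing code: the function name `weighted_centered_smooth`, the 7-day window, the weekend weight of 0.5, and the sample numbers are all assumptions made up for this example.

```python
from datetime import date, timedelta

def weighted_centered_smooth(dates, values, window=7, weekend_weight=0.5):
    """Centered moving average where Saturday/Sunday counts get a smaller
    weight than weekdays (weight 1.0). Windows are truncated at the edges."""
    if weekend_weight <= 0:
        raise ValueError("weekend_weight must be positive")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        idx = range(lo, hi)
        # weekday() returns 5 for Saturday and 6 for Sunday
        weights = [weekend_weight if dates[j].weekday() >= 5 else 1.0 for j in idx]
        total = sum(w * values[j] for w, j in zip(weights, idx))
        smoothed.append(total / sum(weights))
    return smoothed

# Made-up series: 100 cases on weekdays, 50 on Saturdays and Sundays.
start = date(2020, 4, 13)  # a Monday
dates = [start + timedelta(days=i) for i in range(14)]
values = [50 if d.weekday() >= 5 else 100 for d in dates]

weighted = weighted_centered_smooth(dates, values)
unweighted = weighted_centered_smooth(dates, values, weekend_weight=1.0)
```

With a weekend weight below 1 the smoothed curve sits closer to the weekday level. Whether that is an improvement depends on whether the missing weekend cases reappear on Monday, which this sketch does not model.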

Treat Saturday through Monday as one total and divide by 3. It's not perfect, but it spreads the issue out. If you total cases by day of the week, you will see a persistent weekend dip. This may not be true of every state, but Tennessee, where I live, and Virginia have it in their data for sure.
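A minimal sketch of the Saturday-through-Monday averaging described above, assuming daily new-case counts keyed by date. The function name `spread_weekend` and the sample numbers are hypothetical, not taken from any state's data.

```python
from datetime import date, timedelta

def spread_weekend(counts):
    """counts maps date -> daily new cases. Returns a copy in which every
    complete Saturday-Sunday-Monday run is replaced by its three-day mean."""
    adjusted = dict(counts)
    for day in counts:
        if day.weekday() != 5:  # 5 == Saturday
            continue
        block = [day + timedelta(days=k) for k in range(3)]
        if all(d in counts for d in block):
            block_mean = sum(counts[d] for d in block) / 3
            for d in block:
                adjusted[d] = block_mean
    return adjusted

# Made-up example: Friday through Tuesday, with a weekend dip and a Monday spike.
start = date(2020, 4, 17)  # a Friday
raw = [100, 60, 40, 140, 100]
counts = {start + timedelta(days=i): v for i, v in enumerate(raw)}
adjusted = spread_weekend(counts)
```

The three-day total is preserved, so cumulative counts stay the same; only the split across the three days changes.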

commented

I'm in the middle of switching data sources to covidtracking.com right now. Separately, the day-of-week (DOW) effect should already be mitigated by the centered smoothing function, though not perfectly. I'll have a look at this in the next rev.

Covidtracking.com is the data source I've been using. Rates of testing/cases don't have a dramatic weekend effect, but reported deaths absolutely do. New York & NJ are really driving the curve nationally and they seem to be the main culprits. California's reporting is crazy land in terms of "consistency".

Smoothing should mostly account for the day-of-the-week effects. If you want to get fancy, you could pass the data through something like fbprophet, which captures weekly 'seasonality' effects, and then back those effects out.
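As a rough stand-in for the idea of estimating a weekly pattern and backing it out, here is a sketch using a plain ratio-to-moving-average estimate. This is not fbprophet and does not use its API. The helper names `weekday_factors` and `remove_weekday_effect` and the synthetic data are assumptions for illustration only.

```python
from datetime import date, timedelta
from statistics import mean

def weekday_factors(dates, values):
    """Estimate one multiplicative factor per weekday: the average ratio of
    a day's value to the centered 7-day mean around it."""
    ratios = {wd: [] for wd in range(7)}
    for i in range(3, len(values) - 3):
        window_mean = mean(values[i - 3:i + 4])
        if window_mean > 0:
            ratios[dates[i].weekday()].append(values[i] / window_mean)
    # Fall back to 1.0 (no adjustment) when a weekday has no usable ratios.
    return {wd: (mean(r) if r else 1.0) for wd, r in ratios.items()}

def remove_weekday_effect(dates, values):
    """Divide each value by its weekday factor to back the weekly pattern out."""
    factors = weekday_factors(dates, values)
    adjusted = []
    for d, v in zip(dates, values):
        f = factors[d.weekday()]
        adjusted.append(v / f if f > 0 else v)
    return adjusted

# Synthetic series: a flat 100 per day, halved on Sundays and
# inflated by 50% on Mondays to mimic delayed reporting.
start = date(2020, 4, 6)  # a Monday
dates = [start + timedelta(days=i) for i in range(28)]
pattern = {6: 0.5, 0: 1.5}  # weekday() -> multiplier (6 = Sunday, 0 = Monday)
values = [100 * pattern.get(d.weekday(), 1.0) for d in dates]

factors = weekday_factors(dates, values)
adjusted = remove_weekday_effect(dates, values)
```

On this synthetic input the adjustment recovers the flat underlying series. Real data has a trend and noise, so the factors would only be approximate there.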

I'm not sure it's worth the effort however - some basic smoothing should mostly handle this.