nytimes / covid-19-data

A repository of data on coronavirus cases and deaths in the U.S.

Home Page: https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html

Data Issue: Arizona deaths on 7th, 8th and 9th March differ from other sources.

gauravjoshijo opened this issue

Describe the issue:

  • Suspicious number of deaths

Fuller details

Deaths reported for Arizona, US on 7th Mar, 8th Mar and 9th Mar are 0, 0 and 150 respectively, whereas other sources report:

  • Worldometer:
    Deaths on 7th, 8th and 9th March are 13, 13 and 14 respectively.
  • USAFacts:
    Deaths on 7th, 8th and 9th March are 0, 0 and 382 respectively.
  • YCharts:
    Deaths on 7th, 8th and 9th March are 0, 0 and 382 respectively.

Please check the NYT data for these dates, as the figures look suspicious.

Arizona made some changes to their Tableau dashboard embeds around the start of the month, including dropping to "weekly updates".

The changes rolled out on the 2nd of March -- we lost the ability to extract their latest figures without patching our system. We had it fixed within a day, but by then we were in the window in which they did not update any figures until the following week. Our next recorded increase in cases and/or deaths was on the morning of March 9th, matching their new cadence.

Our methodology is to include cases and/or deaths on the day they are announced by local health officials, rather than trying to backdate reports across the previous days. In many cases, we don't have the data to do any backdating, even if we had the staffing.
This is partly why our own visualizations emphasize rolling averages rather than day-by-day differences. Health department reports are not consistent enough for us to rely fully on the daily figures.
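
To illustrate the point about rolling averages, here is a minimal sketch of how a weekly reporting cadence looks in day-by-day differences versus a trailing average. It assumes the input is a cumulative count per day; the numbers are made up for illustration, and the function names and the 7-day window are assumptions for this example, not a description of our actual pipeline.

```python
from collections import deque


def daily_increases(cumulative):
    """Day-over-day differences of a cumulative series.

    The result is one element shorter than the input, because the
    first day has no previous value to compare against.
    """
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def rolling_average(values, window=7):
    """Trailing mean over the most recent `window` values.

    Early entries average over fewer values until the window fills.
    """
    recent = deque(maxlen=window)
    averages = []
    for value in values:
        recent.append(value)
        averages.append(sum(recent) / len(recent))
    return averages


# Hypothetical cumulative deaths (illustrative only): no updates for
# six days, then a single weekly update adds 150 at once.
cumulative = [1000, 1000, 1000, 1000, 1000, 1000, 1000, 1150]

daily = daily_increases(cumulative)   # [0, 0, 0, 0, 0, 0, 150]
smoothed = rolling_average(daily)     # last value is 150 / 7, about 21.43

print(daily)
print([round(x, 2) for x in smoothed])
```

The day-by-day series shows six zeros followed by a spike, while the trailing average spreads the same total across the week. The spike reflects when the figures were announced, not when the deaths occurred.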

We believe other covid-19 data sources may backdate using publicly available data, if the local health departments offer histograms or other time-series figures. Arizona does not do that in a clear way. And if they did, we would primarily use that to correct large anomalies rather than to smooth out increases that we know are already handled by rolling averages.