covidatlas / coronadatascraper

COVID-19 Coronavirus data scraped from government and curated data sources.

Home Page: https://coronadatascraper.com

"tested" data for Washoe County, NV is impossible

mikelehen opened this issue

Location, date, and short issue description

Washoe County, NV (FIPS 32031) has impossible values for "tested" on many (possibly all) dates. In many cases, the day-to-day delta in "cases" is greater than the delta in "tested", which is impossible because every confirmed case requires at least one test. Since the official county website doesn't seem to report "total tests" or anything similar, I suspect the "tested" data is simply bogus.

I don't know how you normally deal with this, but perhaps it's worth either contacting the county or blacklisting the "tested" data (and maybe re-check it every week or two?). Or if this is outside your scope, we can just continue to blacklist it on our side.

Source

The invalid "tested" entries seem to come from the underlying arcgis dataset that powers the county website dashboard. The dashboard itself doesn't show "total tested" anywhere, so I think the data is probably garbage.

File

https://coronadatascraper.com/timeseries-byLocation.json (but will show up in any of the timeseries datasets for Washoe)

Issue details

The delta in "tested" from day-to-day is impossible, since it's often less than the delta in "cases". E.g.:

"2020-05-03": {
  "cases": 977,
  "tested": 903,
  ...
},
"2020-05-04": {
  "cases": 988,
  "tested": 906,
  ...
},

There were 11 new confirmed cases, but only 3 tests? This is one example, but all of the "tested" data is likely wrong (see screenshot below).
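The check described above can be sketched in a few lines of Python. This is only an illustration, not the scraper's or Covid Act Now's actual validation code: the function name `find_impossible_tested` and the message strings are hypothetical, and the sketch assumes that both "cases" and "tested" are cumulative counts and that every confirmed case implies at least one test.

```python
from typing import Dict, List, Tuple

# Hypothetical shape: ISO date -> {"cases": int, "tested": int}
Series = Dict[str, Dict[str, int]]


def find_impossible_tested(series: Series) -> List[Tuple[str, str]]:
    """Return (date, reason) pairs where "tested" contradicts "cases"."""
    problems = []
    prev = None
    for date in sorted(series):  # ISO dates sort chronologically as strings
        entry = series[date]
        if "cases" not in entry or "tested" not in entry:
            prev = None  # skip incomplete entries and restart the delta comparison
            continue
        # Assumption: cumulative tests can never be lower than cumulative cases.
        if entry["tested"] < entry["cases"]:
            problems.append((date, "cumulative tested < cumulative cases"))
        if prev is not None:
            new_cases = entry["cases"] - prev["cases"]
            new_tests = entry["tested"] - prev["tested"]
            # Assumption: each new confirmed case required at least one new test.
            if new_tests < new_cases:
                problems.append((date, "daily tested delta < daily cases delta"))
        prev = entry
    return problems


# The two Washoe County entries quoted in this issue.
washoe = {
    "2020-05-03": {"cases": 977, "tested": 903},
    "2020-05-04": {"cases": 988, "tested": 906},
}

for date, reason in find_impossible_tested(washoe):
    print(date, reason)
```

On the two entries quoted here, both dates fail the cumulative check, and 2020-05-04 also fails the delta check (3 new tests against 11 new cases).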

Snippet/screenshot

At https://covidactnow.org/ we are tracking "test positive rate" and for Washoe County, the current data looks like this:
[Screenshot: test positive rate chart for Washoe County]

Everything over 100% is impossible, and the rest is pretty suspect too.
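To show why rates above 100% fall out of this data, here is a minimal sketch of a day-over-day test positive rate. The exact formula Covid Act Now uses is not given in this issue, so the function `daily_positive_rate` and its "new cases divided by new tests" definition are assumptions made purely for illustration.

```python
from typing import Dict, Optional


def daily_positive_rate(prev: Dict[str, int], curr: Dict[str, int]) -> Optional[float]:
    """New cases divided by new tests between two cumulative snapshots.

    Returns None when no new tests were recorded, since the rate is undefined.
    """
    new_cases = curr["cases"] - prev["cases"]
    new_tests = curr["tested"] - prev["tested"]
    if new_tests <= 0:
        return None
    return new_cases / new_tests


# The two Washoe County entries quoted in this issue.
may_3 = {"cases": 977, "tested": 903}
may_4 = {"cases": 988, "tested": 906}

rate = daily_positive_rate(may_3, may_4)
if rate is not None and rate > 1.0:
    print(f"Impossible positive rate: {rate:.0%}")
```

With 11 new cases and 3 new tests, the rate is 11/3, far above the 100% ceiling that valid data would respect.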

@jzohrab @appastair can y'all look into this? Thanks!

I see what you mean, @mikelehen - the data is definitely off, although it is being updated consistently. 🤷‍♀️

{
  "OBJECTID": 144,
  ...
  "reportdt": 1589229462000,
  "confirmed": 1100,
  "recovered": 529,
  "deaths": 39,
  "active": 532,
  "tested": 1035,
  ...
  "reportedtPT": "5/11/2020 1:37 PM",
  ...
}
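A quick way to sanity-check a record like this one is sketched below. It assumes that "reportdt" is a Unix timestamp in milliseconds (UTC) and that Pacific Daylight Time (UTC-7) applied on that date; both are assumptions, not something the dataset documents here. The elided fields are simply left out of the dictionary.

```python
from datetime import datetime, timedelta, timezone

# Fields copied from the record above; elided fields are omitted.
record = {
    "reportdt": 1589229462000,
    "confirmed": 1100,
    "recovered": 529,
    "deaths": 39,
    "active": 532,
    "tested": 1035,
}

# Assumption: fixed UTC-7 offset (PDT) rather than a full time zone database lookup.
PDT = timezone(timedelta(hours=-7), "PDT")

# Assumption: "reportdt" is milliseconds since the Unix epoch, in UTC.
reported_utc = datetime.fromtimestamp(record["reportdt"] / 1000, tz=timezone.utc)
reported_local = reported_utc.astimezone(PDT)

# "active" should equal confirmed minus recovered minus deaths.
active_is_consistent = (
    record["active"] == record["confirmed"] - record["recovered"] - record["deaths"]
)
# A plausible cumulative "tested" count cannot be below the confirmed count.
tested_is_plausible = record["tested"] >= record["confirmed"]

print(reported_local.isoformat())
print("active consistent:", active_is_consistent)
print("tested plausible:", tested_is_plausible)
```

Under those assumptions the timestamp converts to 13:37 local time on 2020-05-11, which matches the "reportedtPT" string, and "active" agrees with the other counts. Only "tested" fails its check, which suggests the record is being maintained but that this one field is wrong.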

The latest daily press release reports 14,029 tests.

I've attempted to contact them via social media to see if they'll add the hidden, erroneous field to the dashboard. If that fails, I can attempt to contact them through another channel.

In the meantime, I'm not sure what the done thing is. Wait for correction or remove the field, @lazd ?

commented

I'm removing the "tested" field; it's bogus.

commented

Removed in PR #1045.

Thanks @mikelehen !