danirockyo / nc-covid-by-zip

A repository for data published by the N.C. Department of Health and Human Services on zip code-level COVID-19 deaths and cases in North Carolina

NC COVID-19 zip code data

This is a running repository for data captured daily by reporters from WRAL News via the N.C. Department of Health and Human Services' zip code-level map of COVID-19 cases and deaths.

Methodology

For now, this process is completed manually. The next step is to automate it.

We're using pyesridump to capture the raw data from the map layer published daily via the State of North Carolina's ArcGIS account. The REST endpoint is the URL in the command under Usage. (Note: Although DHHS first published its ZIP code map April 30, the agency started using this new layer on May 4. The endpoint for this map may change again.)

Usage

esri2geojson https://services.arcgis.com/iFBq2AW9XO0jYYF7/arcgis/rest/services/Covid19byZIPnew/FeatureServer/0 nc_zipDATE.geojson
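Since the capture runs daily, the command above could be assembled by a small script. The following is a minimal sketch, not the process used here: the `build_capture_command` helper and the `MMDD` date format standing in for `DATE` are assumptions for illustration.

```python
import datetime
import shlex

# Layer URL copied from the usage line above.
LAYER_URL = (
    "https://services.arcgis.com/iFBq2AW9XO0jYYF7/arcgis/rest/services/"
    "Covid19byZIPnew/FeatureServer/0"
)


def build_capture_command(capture_date):
    """Return the esri2geojson argument list for one day's capture."""
    # Hypothetical naming scheme: "nc_zip" plus MMDD in place of DATE.
    output_name = f"nc_zip{capture_date:%m%d}.geojson"
    return ["esri2geojson", LAYER_URL, output_name]


if __name__ == "__main__":
    command = build_capture_command(datetime.date.today())
    # Print the command for review; it could then be run with
    # subprocess.run(command, check=True) once esri2geojson is installed.
    print(shlex.join(command))
```

Printing the command rather than running it keeps the sketch safe to try, since the endpoint may change.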

After the GeoJSON file is captured, we upload it to MapShaper to reduce the file to 10 percent of its original size, which speeds up load times, and download the result as a simplified GeoJSON file.

We're also using QGIS software to export the file in CSV format.

Once DHHS publishes the new file around 11 a.m., we can process the data and update our own map accordingly.

5/20 UPDATE: DHHS changed its data dashboard on May 20 and no longer appears to be updating the data in the shapefile layer. I'm checking with the agency on whether this will change in the future. But in the meantime, the process takes a little more time.

Tableau, the platform DHHS is using to visualize its data, is not set up to allow direct downloads in a structured format, but you can download ZIP code data as a PDF. We're then using Tabula PDF to convert this PDF to a spreadsheet and matching that spreadsheet with previous ZIP code data (population count, place name, etc.) to keep the formatting consistent with past versions of the data.

The PDF produced by the Tableau download omits ZIP codes where:

  1. the population is less than 500 AND case count is less than 5
  2. the case count is zero.

All 779 N.C. ZIP codes, including those with a zero case count, are included in the post-May 20 data below. Case counts are blank where the population is less than 500 and the case count is less than 5.
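The rules above imply a way to rebuild the full ZIP code list from the partial PDF. The sketch below is one possible reading, under stated assumptions: an omitted ZIP code with a population of 500 or more must have zero cases, and an omitted ZIP code with a population under 500 is left blank, even though its true count could be zero. The function name and dictionary layout are hypothetical.

```python
def fill_missing_zips(reference, pdf_cases):
    """Build a full ZIP code list from a partial PDF extract.

    reference: dict mapping ZIP code -> population (from earlier captures).
    pdf_cases: dict mapping ZIP code -> case count (from the converted PDF).
    """
    rows = []
    for zip_code, population in sorted(reference.items()):
        if zip_code in pdf_cases:
            cases = pdf_cases[zip_code]
        elif population < 500:
            # Omitted and small: the count is under 5 (possibly zero),
            # so it is left blank. This is an assumption of the sketch.
            cases = None
        else:
            # Omitted with 500 or more residents: only the zero-case
            # rule can explain the omission.
            cases = 0
        rows.append({"zip": zip_code, "population": population, "cases": cases})
    return rows


# Usage with made-up populations and counts.
reference = {"27601": 10000, "27602": 300, "27603": 8000}
pdf_cases = {"27601": 25}
for row in fill_missing_zips(reference, pdf_cases):
    print(row)
```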

5/23 UPDATE: A correction to the information above: N.C. DHHS is still updating its shapefile through ArcGIS. To stay consistent, I will continue to use that data in our map and provide it here. The exception is the several days I missed while gathering information from the state about its plans for the data going forward; I will try to track those down and backfill them.

The state considers the data on its new Tableau dashboard to be the most up-to-date ZIP code information. But the shapefiles used by that platform are slightly different from those used in the ArcGIS platform (ZIP codes are not standardized shapes, per se, and change often). Because both platforms perform some level of data cleaning and geocoding to place addresses in the appropriate ZIP codes when aggregating cases and deaths, numbers on the state's new dashboard may not align 100% with the ArcGIS data used for WRAL's map.

Data

Below are the time-series files starting with the first date of capture on May 1. We'll eventually start combining these into a single file to show growth over time.
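As a rough idea of what the combined file could look like, the sketch below stacks per-day counts into long-format rows and adds the change since the previous capture. It is an assumption about the eventual layout, not the format used here: the function name, column names, and sample values are all hypothetical.

```python
def combine_daily_counts(daily_counts):
    """Stack per-day counts into rows ordered by capture date, then ZIP code.

    daily_counts: dict mapping ISO date string -> {ZIP code: case count}.
    Blank (None) counts are kept and produce a blank change.
    """
    rows = []
    previous = {}
    # ISO date strings sort chronologically.
    for capture_date in sorted(daily_counts):
        for zip_code, cases in sorted(daily_counts[capture_date].items()):
            prior = previous.get(zip_code)
            if cases is not None and prior is not None:
                change = cases - prior
            else:
                change = None
            rows.append({
                "date": capture_date,
                "zip": zip_code,
                "cases": cases,
                "change": change,
            })
            previous[zip_code] = cases
    return rows


# Usage with made-up counts for two capture dates.
daily = {
    "2020-05-02": {"27601": 12},
    "2020-05-01": {"27601": 10},
}
for row in combine_daily_counts(daily):
    print(row)
```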
