Chicago Dots

the finest dots for Chicago

| | Full Population | Over-18 Population | Under-18 Population |
|---|---|---|---|
| 1 person per point | points_full_1.geojson (54M) | points_over_18_1.geojson (44M) | points_under_18_1.geojson (11M) |
| 5 people per point | points_full_5.geojson (11M) | points_over_18_5.geojson (8.7M) | points_under_18_5.geojson (2.2M) |
| 10 people per point | points_full_10.geojson (5.4M) | points_over_18_10.geojson (4.4M) | points_under_18_10.geojson (1.1M) |
| 50 people per point | points_full_50.geojson (1.1M) | points_over_18_50.geojson (890K) | points_under_18_50.geojson (221K) |
| 100 people per point | points_full_100.geojson (560K) | points_over_18_100.geojson (449K) | points_under_18_100.geojson (111K) |

What is this?

Dot density maps are a great way to show the distribution of countable things across a map.

The usual approach is to randomly generate N points within a boundary, where N is proportional to the number you want to show, e.g. the number of people who live in a census block or the number of people who voted for a candidate in an election precinct.
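
In code, that usual approach often amounts to a rejection sampler along these lines; this is just an illustrative sketch, not code from this project, and the polygon and count are placeholders:

import random
from shapely.geometry import Point, Polygon

def random_points_in(boundary: Polygon, n: int) -> list[Point]:
    """Rejection-sample n uniformly random points inside a polygon."""
    minx, miny, maxx, maxy = boundary.bounds
    points = []
    while len(points) < n:
        candidate = Point(random.uniform(minx, maxx), random.uniform(miny, maxy))
        if boundary.contains(candidate):
            points.append(candidate)
    return points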

But boundaries are often quite weird, and that approach can mean we put dots in the lake or the middle of the highway. If the data we are mapping is related to where people live, it would be nice to put the dots where people are likely to be. That's what this project is here to help you do, at least for Chicago.

Approach

The U.S. Decennial Census gives us high-resolution data on the number of people who live in a census "block," which often looks like a real city block in Chicago.

We start with this block data and then use dasymetric mapping to refine the block data with auxiliary data.

First we use CMAP's remarkable land use data set that classifies how land is being used at the parcel level.

We then further subdivide the land use data into the portions that intersect with building footprints and the portions that do not intersect with any buildings. We use non-negative linear regression to estimate the population density of the different classes of land use in both their building-intersection and building-difference variants.
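
The density-estimation step might look roughly like the following sketch, using scipy's non-negative least squares; the arrays and their values are purely illustrative placeholders, not the project's actual prepared data:

import numpy as np
from scipy.optimize import nnls

# Rows are census blocks; columns are the area of each block covered by one
# land use class (split into building-intersection and building-difference
# variants). These numbers are placeholders for the real prepared data.
subarea_areas = np.array([
    [1200.0, 300.0,  0.0, 800.0],
    [ 900.0, 100.0, 50.0, 600.0],
    [ 400.0, 700.0, 20.0, 100.0],
])
block_populations = np.array([310.0, 220.0, 95.0])

# Solve for non-negative per-class densities (people per unit area) such that
# subarea_areas @ densities approximates the observed block populations.
densities, residual = nnls(subarea_areas, block_populations)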

With all this data prepared, we then take the following steps:

  1. For each 2020 census block in Chicago, divide it into subareas of different land uses, building footprints, and non-empty land.
  2. Then, allocate the block population to each subarea in rough proportion to the area of the subarea multiplied by the estimated population density of that land use category.
  3. Finally, randomly generate points in each subarea (a rough sketch follows this list).
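
Putting the pieces together, steps 1-3 might look roughly like this geopandas sketch; the file names and columns (block_id, population, density) are hypothetical stand-ins for the prepared data, and the sampler is the same rejection approach sketched earlier:

import random

import geopandas as gpd
from shapely.geometry import Point

# Hypothetical inputs: 2020 census blocks with their populations, and
# land use / building-footprint subareas carrying the estimated densities.
blocks = gpd.read_file('blocks_2020.geojson')          # block_id, population, geometry
subareas = gpd.read_file('landuse_subareas.geojson')   # landuse_class, density, geometry

# Step 1: split each block into its land use / building subareas.
pieces = gpd.overlay(blocks, subareas, how='intersection')

# Step 2: allocate the block population in proportion to area * estimated density.
pieces['weight'] = pieces.geometry.area * pieces['density']
pieces['share'] = pieces['weight'] / pieces.groupby('block_id')['weight'].transform('sum')
pieces['people'] = (pieces['share'] * pieces['population']).round().astype(int)

# Step 3: rejection-sample that many random points inside each subarea.
def random_points_in(polygon, n):
    minx, miny, maxx, maxy = polygon.bounds
    points = []
    while len(points) < n:
        candidate = Point(random.uniform(minx, maxx), random.uniform(miny, maxy))
        if polygon.contains(candidate):
            points.append(candidate)
    return points

dots = [point
        for piece in pieces.itertuples()
        for point in random_points_in(piece.geometry, piece.people)]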

Example: votes by precincts

These maps show the number of votes for Toni Preckwinkle in the February 2019 mayoral election, by electoral precinct in the 5th ward. The 5th ward includes many large parks and non-residential areas. The map on the left uses uniformly random points within the precincts. The map on the right uses dasymetric dots from this project.

(Left: uniformly random dots. Right: dasymetric dots.)

How to use

For each area you want to generate N points for, find the dasymetric points that intersect with the area, then take N of those points.

Here's how you might do it in PostGIS (assuming you downloaded and imported a GeoJSON file):

-- Number the dasymetric points that fall inside each precinct,
-- then keep as many points as that precinct has votes.
SELECT precinct_id, geom
FROM (
  SELECT precincts.id AS precinct_id,
         points_full_1.geom AS geom,
         ROW_NUMBER() OVER (PARTITION BY precincts.id) AS point_num,
         precincts.n_votes
  FROM precincts
  JOIN points_full_1 ON ST_Intersects(precincts.geom, points_full_1.geom)
) AS intersections
WHERE point_num <= n_votes;

or in Python, with geopandas and shapely:

import geopandas as gpd

# Load the precinct polygons and the dasymetric points into GeoDataFrames
polygons = gpd.read_file('path/to/precincts.geojson')
multipoint = gpd.read_file('path/to/points_full_1.geojson')

# Explode the multipoint features into individual points
points = multipoint.explode(index_parts=True)

# Spatial join: find the points that fall within each precinct
intersections = gpd.sjoin(points, polygons, predicate='intersects')

# For each precinct, sample as many points as it has votes
arbitrary_points = (
    intersections
    .groupby('index_right')
    .apply(lambda x: x.sample(n=x.iloc[0]['TONI PRECKWINKLE'], replace=True))
    .reset_index(drop=True)
)
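
From there, a quick way to check the result is to plot it; this assumes matplotlib is installed and reuses the GeoDataFrames from the snippet above:

import matplotlib.pyplot as plt

# Draw the precinct boundaries with the sampled dots on top
ax = polygons.plot(color='white', edgecolor='grey')
arbitrary_points.plot(ax=ax, markersize=1, color='black')
plt.show()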

To build the data yourself

System requirements

  • wget
  • gdal
  • gdal-bin
  • libgdal-dev
  • libsqlite3-mod-spatialite
  • spatialite-bin
> git clone git@github.com:fgregg/chicago-dots.git
> cd chicago-dots
> pip install -e .
> make


Thanks

The random point generation is adapted from Ben Schmidt's dot density code.
