beefoo / subway-inequality

Sonification of Income Inequality on the NYC Subway

This is a set of scripts that generate songs based on median household income data along different subway lines in New York City. It extends a song I produced as the Data-Driven DJ in 2015. For more information about how that song was created, visit the project page on the Data-Driven DJ website.

This codebase produces music in the same way as the Data-Driven DJ project referenced above, but it streamlines the process of generating new songs from new data (in this case, census data from the 2017 American Community Survey, or ACS) and adds support for generating songs for any subway line.

Data sources

I generated a simple visualization that combines the Census tract data with income data.

Requirements for generating music and visualization

  • Python 3 (developed using Python 3.6; 3.5+ will likely work)
  • NumPy
  • Pydub - For audio manipulation

Only required for visualization

The song comes with a basic visualization showing where you are in New York City at any given time in the song. This step requires a few more libraries:

  • Pillow - For image generation
  • Gizeh - For vector graphics. Requires Cairo to be installed
  • FFmpeg - For encoding the video file

Only required for preprocess step

This is only necessary if you want to generate songs from different or new data (i.e., not the 2017 data in the repository).

  • Shapely - For geometric calculations (only required for the preprocess.py step)

Preprocessing new data

This repository already contains preprocessed data from the 2017 American Community Survey (ACS). If you have a different dataset obtained from the Census, you can do the following to preprocess the data. Otherwise, you can skip this step.

python preprocess.py -census "data/YOUR_DATA_FILE.csv"

This script does the following:

  1. Reads median household income data from the Census, broken down by census tract
  2. Reads 2010 NYC Census tract data and determines lat/lon coordinates for each tract
  3. Reads MTA subway station data which contains the lat/lon for each station
  4. Matches each subway station to the two closest census tracts
  5. Takes the weighted mean of the median household income from the two tracts, weighted by distance from the station. This accounts for a station that sits at the edge of two tracts, or for two stations that fall within the same tract.
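
Steps 4 and 5 above can be sketched roughly as follows. This is a minimal illustration, not the code in preprocess.py: it assumes each tract is reduced to a single point (for example its centroid), that "weighted by distance" means inverse-distance weighting so the closer tract counts for more, and that distances are great-circle distances. The function names `haversine_km` and `station_income` are hypothetical.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def station_income(station, tracts, k=2):
    """Inverse-distance weighted mean income of the k tracts closest to a station.

    station: (lat, lon)
    tracts:  list of (lat, lon, income), one point per tract (assumption)
    """
    nearest = sorted(
        (haversine_km(station[0], station[1], lat, lon), income)
        for lat, lon, income in tracts
    )[:k]
    # A station lying exactly on a tract point simply takes that tract's income.
    for dist, income in nearest:
        if dist == 0:
            return income
    weights = [1.0 / dist for dist, _ in nearest]
    total = sum(w * income for w, (_, income) in zip(weights, nearest))
    return total / sum(weights)

# Made-up coordinates and incomes, for illustration only
tracts = [(40.750, -73.990, 60000), (40.752, -73.985, 90000), (40.800, -73.950, 30000)]
print(station_income((40.751, -73.989), tracts))
```

With inverse-distance weights, a station close to one tract and far from the other ends up near the closer tract's income, which matches the stated goal of handling stations at the edge of two tracts.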

The script writes one .csv file per subway line to data/lines/{LINE SYMBOL}.csv. Each file contains an income column that represents the median household income of the census tracts surrounding each station.

The files for the 2017 data have already been generated and are included in the repository.
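
As a quick way to inspect one of these files, the sketch below reads the income column and scales it to the range 0..1, a common first step before mapping a value to a musical parameter. Only the `income` column name comes from the description above; the `name` column, the sample rows, and the helper names `read_incomes` and `normalize` are assumptions for illustration, and make.py may process the data differently.

```python
import csv
import io

def read_incomes(csv_file):
    """Return the `income` column as floats, skipping rows with no value."""
    reader = csv.DictReader(csv_file)
    return [float(row["income"]) for row in reader if row.get("income")]

def normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Made-up sample rows; a real file would be opened with open("data/lines/7.csv", newline="")
sample = io.StringIO("name,income\nStation A,40000\nStation B,100000\nStation C,70000\n")
incomes = read_incomes(sample)
print(normalize(incomes))  # [0.0, 1.0, 0.5]
```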

Generating music and visualization

The following script generates both the audio and the visuals for a single subway line and compiles them into a video. At a minimum, you need to specify a subway line's .csv file (-data) and an image that represents the line's bullet symbol (-img).

python make.py -data "data/lines/7.csv" -img "img/7.png"

If you just want the audio, you can run:

python make.py -data "data/lines/7.csv" -ao

When creating a song for an express train, you might want to include the local stops as implicit data points between express stops. The local stops will not be labeled, but they add nuance between express stops, which can be far apart. Here's a command that creates a song based on the 2 train, with local 1 train stops between express stops:

python3 make.py -data "data/lines/2.csv" -loc "data/lines/1.csv"
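
The idea of filling in local stops between express stops can be sketched as below. This is an illustration under assumptions, not the logic in make.py: it assumes stations can be matched by name between the two lines and that each line is an ordered list of stops. The function `add_local_stops` and the `labeled` flag are hypothetical.

```python
def add_local_stops(express, local):
    """Insert local-only stops between consecutive express stops.

    express, local: ordered lists of station names.
    Returns a list of (name, labeled) tuples; labeled is False for
    stops that come only from the local line.
    """
    index = {name: i for i, name in enumerate(local)}
    merged = []
    for pos, name in enumerate(express):
        merged.append((name, True))
        if pos + 1 == len(express):
            break
        i, j = index.get(name), index.get(express[pos + 1])
        # Fill the gap only when both express stops also appear on the
        # local line, in the same order.
        if i is not None and j is not None and i < j:
            merged.extend((stop, False) for stop in local[i + 1 : j])
    return merged

# Placeholder station names, for illustration only
express = ["A", "C", "E", "X"]
local = ["A", "B", "C", "D", "E", "F"]
print(add_local_stops(express, local))
```

Express stops that do not appear on the local line (like "X" here) are kept as they are, with nothing inserted around them.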

Many options are available for tweaking the end result. You can find their descriptions by running:

python make.py -h

Conversion to .webm format

With a target bitrate of 400 kbit/s:

ffmpeg -i subway_line_7_loop.mp4 -c:v libvpx-vp9 -b:v 400K -pass 1 -an -f webm /dev/null && \
ffmpeg -i subway_line_7_loop.mp4 -c:v libvpx-vp9 -b:v 400K -pass 2 -c:a libopus subway_line_7_loop.webm

For Windows:

ffmpeg -i subway_line_7_loop.mp4 -c:v libvpx-vp9 -b:v 400K -pass 1 -an -f webm NUL && ^
ffmpeg -i subway_line_7_loop.mp4 -c:v libvpx-vp9 -b:v 400K -pass 2 -c:a libopus subway_line_7_loop.webm
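
If you would rather not maintain separate commands per platform, the same two-pass encode can be driven from Python. The sketch below mirrors the flags above and uses `os.devnull`, which resolves to the null device on both Unix-like systems and Windows. The helper names `two_pass_commands` and `encode` are hypothetical and not part of this repository.

```python
import os
import subprocess

def two_pass_commands(src, dst, bitrate="400K"):
    """Build the two ffmpeg argument lists for a two-pass VP9 encode."""
    common = ["ffmpeg", "-i", src, "-c:v", "libvpx-vp9", "-b:v", bitrate]
    # Pass 1 analyses the video only; its output is discarded.
    first = common + ["-pass", "1", "-an", "-f", "webm", os.devnull]
    # Pass 2 writes the final file and encodes the audio with Opus.
    second = common + ["-pass", "2", "-c:a", "libopus", dst]
    return first, second

def encode(src, dst, bitrate="400K"):
    """Run both passes in order, stopping if either one fails."""
    for cmd in two_pass_commands(src, dst, bitrate):
        subprocess.run(cmd, check=True)

if __name__ == "__main__":
    for cmd in two_pass_commands("subway_line_7_loop.mp4", "subway_line_7_loop.webm"):
        print(" ".join(cmd))
```

Calling `encode("subway_line_7_loop.mp4", "subway_line_7_loop.webm")` would run the two passes; it requires FFmpeg on your PATH.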

More documentation

License: MIT License

