Oliver-BE / transit-gdp-project

An exploration of predicting GDP per capita from transit statistics with an interactive Shiny app and R package containing our full dataset (aggregation of 30 different datasets).

Home Page: https://jonah.shinyapps.io/test/


Transit GDP Project

An exploration of predicting GDP per capita from transit statistics with an interactive Shiny app and R package containing our full dataset (aggregation of 30 different datasets)

Shiny App

Our Shiny app can be found at the following link: https://jonah.shinyapps.io/test/

R Package dataset

The dataset transit_qol_df can be found in the gluskr package we created for this project.

Variable Definitions

msa_id A numeric code that uniquely identifies each metropolitan and micropolitan statistical area for which the Census Bureau tabulates data. This is identical to the Census Bureau's 'GEOid.2'.

msa The name of the metropolitan or micropolitan statistical area associated with a particular msa_id.

pop_estimate_msa The estimated population for the area delineated by a specific msa_id.

year The year in which the data in that row were collected.

median_household_income_msa Median household income, recorded in dollars. It refers to the income and benefits for a household in inflation-adjusted dollars. The data came from a separate table for each year, so each value is expressed in the inflation-adjusted dollars of its own year.

percent_workers_commuting_by_public_transit_msa The number of workers who commuted by public transportation (which excludes taxicabs) divided by the total number of workers 16 years and older, expressed as a percentage. Workers include members of the Armed Forces and civilians who were at work the week before the survey was given out.

percent_unemployed_msa The percentage of the population 16 years and older that is unemployed and in the civilian labor force.

percent_no_insurance_msa The percentage of the noninstitutionalized civilian population without health insurance coverage.

percent_below_poverty_level The percentage of all people in the United States who have income below the poverty level. This variable is available only for the years 2010-2017.

gdp_msa A measure of gross domestic product across all industries in millions of current dollars.

intptlat Internal point latitude for each specific metropolitan and micropolitan statistical entity.

intptlon Internal point longitude for each specific metropolitan and micropolitan statistical entity.

ua_census_id The Census Bureau's unique ID code for every urbanized area in the US.

ua_sq_miles_2010 The size of the urbanized area in 2010, in square miles.

ua_fta_name The urbanized area's name.

transit_modes The mode of transit categorized as rail, non-rail bus, and non-rail other. The following modes of transit are grouped into rail: Alaska Railroad (AR), Cable Car (CC), Commuter Rail (CR), Heavy Rail (HR), Hybrid Rail (YR), Inclined Plane (IP), Light Rail (LR), Monorail/Automated Guideway (MG), Streetcar Rail (SR). The following modes of transit are grouped into non-rail bus: Commuter Bus (CB), Bus (MB), Bus Rapid Transit (RB), Jitney (JT), Público (PB), Trolleybus (TB). The following modes of transit are grouped into non-rail other: Ferryboat (FB), Aerial Tramway (TR), Vanpool (VP), Demand Response (DR), Demand Response – Taxi (DT).
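
The grouping above can be written as a small lookup. This is only a sketch in base R: classify_mode() and the three vectors are hypothetical names for illustration, not functions or objects from the gluskr package, and returning NA for unlisted codes is our assumption.

```r
# FTA mode codes, grouped exactly as listed in the transit_modes definition.
rail_modes  <- c("AR", "CC", "CR", "HR", "YR", "IP", "LR", "MG", "SR")
bus_modes   <- c("CB", "MB", "RB", "JT", "PB", "TB")
other_modes <- c("FB", "TR", "VP", "DR", "DT")

# Hypothetical helper: map a vector of mode codes to the three groups.
# Codes that appear in none of the lists are returned as NA.
classify_mode <- function(code) {
  group <- rep(NA_character_, length(code))
  group[code %in% rail_modes]  <- "rail"
  group[code %in% bus_modes]   <- "non-rail bus"
  group[code %in% other_modes] <- "non-rail other"
  group
}

classify_mode(c("LR", "MB", "VP", "XX"))
# "rail" "non-rail bus" "non-rail other" NA
```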

total_transit_expenses Total expenses for all transit agencies in the urbanized area.

total_fares Total fares collected by all transit agencies in the urbanized area.

directional_route_mile Defined by the FTA as: "The mileage in each direction over which public transportation vehicles travel while in revenue service."

vehicle_hours Also known as Vehicle Revenue Hours. It is defined by the FTA as: "The hours that vehicles are scheduled to or actually travel while in revenue service."

vehicle_miles Also known as Vehicle Revenue Miles. It is defined by the FTA as: "The miles that vehicles are scheduled to or actually travel while in revenue service."

passenger_miles Defined by the FTA as: "The cumulative sum of the distances ridden by each passenger."

passenger_trips Also known as Unlinked Passenger Trips. It is defined by the FTA as: "The number of passengers who board public transportation vehicles. Passengers are counted each time they board vehicles no matter how many vehicles they use to travel from their origin to their destination."

total_stations_2017 The total number of stations within the given urbanized area in 2017.

total_funding Total funding of transit agencies in the given urbanized area for that year. It is the sum of federal_funding, state_funding, local_funding, and other_funding.

federal_funding Funding for the transit agencies in the given urbanized area from the federal government.

state_funding Funding for the transit agencies in the given urbanized area from the relevant state government.

local_funding Funding for the transit agencies in the given urbanized area from the relevant local government.

other_funding Funding for the transit agencies in the given urbanized area that is derived from other sources. This can include fares, advertising, parking fees, etc.
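
As a quick illustration of how total_funding relates to its four components, here is a sketch in base R. The numbers are made up and the data frame is a stand-in; with the real data, the same columns of transit_qol_df would be used.

```r
# Made-up funding figures for two urbanized areas (illustration only).
funding <- data.frame(
  federal_funding = c(100, 250),
  state_funding   = c(50, 0),
  local_funding   = c(75, 125),
  other_funding   = c(25, 25)
)

# total_funding is the row-wise sum of the four funding sources.
funding_cols <- c("federal_funding", "state_funding",
                  "local_funding", "other_funding")
funding$total_funding <- rowSums(funding[, funding_cols])

funding$total_funding
# 250 400
```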

per_capita_gdp GDP per capita, calculated as gdp_msa divided by pop_estimate_msa.
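
A worked example of this ratio, as a sketch in base R with made-up numbers. Because gdp_msa is recorded in millions of dollars, the raw quotient is in millions of dollars per person; whether the dataset rescales it to dollars is not stated here, so the conversion shown is an assumption for illustration only.

```r
# Made-up values for one MSA (illustration only).
gdp_msa          <- 60000     # millions of current dollars
pop_estimate_msa <- 1000000   # people

# Ratio exactly as defined: millions of dollars per person.
per_capita_gdp <- gdp_msa / pop_estimate_msa

# Assumed conversion to dollars per person (not confirmed by the dataset docs).
per_capita_gdp_dollars <- gdp_msa * 1e6 / pop_estimate_msa

per_capita_gdp          # 0.06
per_capita_gdp_dollars  # 60000
```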

pmt_per_vrm A metric. Defined as passenger_miles divided by vehicle_miles.

pmt_per_vrh A metric. Defined as passenger_miles divided by vehicle_hours.

upt_per_vrh A metric. Defined as passenger_trips divided by vehicle_hours.

per_capita_vrm A metric. Defined as vehicle_miles divided by pop_estimate_msa.

per_capita_vrh A metric. Defined as vehicle_hours divided by pop_estimate_msa.

per_capita_pmt A metric. Defined as passenger_miles divided by pop_estimate_msa.

per_capita_upt A metric. Defined as passenger_trips divided by pop_estimate_msa.
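
The service metrics defined above are plain ratios of existing columns. The sketch below recomputes them in base R on a one-row data frame of made-up values; the object name ua is hypothetical, and the column names follow the definitions in this section.

```r
# Made-up values for one area in one year (illustration only).
ua <- data.frame(
  pop_estimate_msa = 1000000,
  passenger_miles  = 50000000,
  passenger_trips  = 10000000,
  vehicle_miles    = 5000000,
  vehicle_hours    = 400000
)

# Service-efficiency metrics.
ua$pmt_per_vrm <- ua$passenger_miles / ua$vehicle_miles   # 10
ua$pmt_per_vrh <- ua$passenger_miles / ua$vehicle_hours   # 125
ua$upt_per_vrh <- ua$passenger_trips / ua$vehicle_hours   # 25

# Per-capita metrics, all divided by the MSA population estimate.
ua$per_capita_vrm <- ua$vehicle_miles   / ua$pop_estimate_msa   # 5
ua$per_capita_vrh <- ua$vehicle_hours   / ua$pop_estimate_msa   # 0.4
ua$per_capita_pmt <- ua$passenger_miles / ua$pop_estimate_msa   # 50
ua$per_capita_upt <- ua$passenger_trips / ua$pop_estimate_msa   # 10
```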

recovery_ratio A metric. Defined as total_fares divided by total_transit_expenses. Almost always below 1, as fares alone do not cover the whole cost of public transit. Essentially measures the share of a transit agency's expenses that is recovered through fares; one minus the ratio is the share that must be covered by other funding.

fares_per_upt A metric. Defined as total_fares divided by passenger_trips.

cost_per_hour A metric. Defined as total_transit_expenses divided by vehicle_hours.

cost_per_trip A metric. Defined as total_transit_expenses divided by passenger_trips.

cost_per_pmt A metric. Defined as total_transit_expenses divided by passenger_miles.
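
The financial metrics follow the same pattern. This is a sketch in base R with made-up values; the object name fin is hypothetical.

```r
# Made-up values for one area in one year (illustration only).
fin <- data.frame(
  total_fares            = 30000000,
  total_transit_expenses = 120000000,
  passenger_trips        = 10000000,
  passenger_miles        = 50000000,
  vehicle_hours          = 400000
)

fin$recovery_ratio <- fin$total_fares / fin$total_transit_expenses      # 0.25
fin$fares_per_upt  <- fin$total_fares / fin$passenger_trips             # 3
fin$cost_per_hour  <- fin$total_transit_expenses / fin$vehicle_hours    # 300
fin$cost_per_trip  <- fin$total_transit_expenses / fin$passenger_trips  # 12
fin$cost_per_pmt   <- fin$total_transit_expenses / fin$passenger_miles  # 2.4
```

In this example a recovery_ratio of 0.25 means fares cover a quarter of expenses, so the remaining three quarters would have to come from other funding.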

Notes on data

Internal Points The Census Bureau calculates an internal point (latitude and longitude coordinates) for each geographic entity. For many geographic entities, the internal point is at or near the geographic center of the entity. For some irregularly shaped entities (such as those shaped like a crescent), the calculated geographic center may be located outside the boundaries of the entity. In such instances, the internal point is identified as a point inside the entity boundaries nearest to the calculated geographic center and, if possible, within a land polygon. (https://www.census.gov/geo/reference/gtc/gtc_area_attr.html).

Metrics The Federal Transit Administration uses all these metrics. The FTA's Small Transit Intensive Cities (STIC) formula, which determines funding allocation, uses the following metrics: 1. Passenger miles traveled per vehicle revenue mile, 2. Passenger miles traveled per vehicle revenue hour, 3. Vehicle revenue miles per capita, 4. Vehicle revenue hours per capita, 5. Passenger miles traveled per capita, and 6. Passengers per capita.

The FTA's file called "Metrics" includes "Fare Revenues per Unlinked Passenger Trip", "Fare Revenues per Total Operating Expense (Recovery Ratio)", "Cost per Hour", "Passengers per Hour", "Cost per Passenger", and "Cost per Passenger Mile." That was how we decided to use these statistics as metrics.

Reproducibility All the steps necessary to reproduce our results can be found by knitting our Technical-Report.Rmd.

Sources

The variables median_household_income_msa, percent_workers_commuting_by_public_transit_msa, percent_unemployed_msa, percent_no_insurance_msa, and percent_below_poverty_level were obtained from:

The United States Census Bureau, American FactFinder website. Table ID: DP03. Table Title: "Selected Economic Characteristics". Data Set: "ACS (American Community Survey) 1-year estimates". Period: 2007-2017.

The variable gdp_msa was obtained from:

The Bureau of Economic Analysis, U.S. Department of Commerce. Table: GDP and Personal Income/Gross Domestic Product (GDP). Area: United States (Metropolitan Portion). Statistic: All industry total. Unit of Measure: Levels. Period: 2007-2017.

The variables ua_census_id, ua_pop_2010, ua_sq_miles_2010, ua_fta_name, transit_modes, passenger_trips, vehicle_miles, vehicle_hours, total_stations_2017, total_funding, state_funding, local_funding, and other_funding were obtained from:

The National Transit Database, Federal Transit Administration. Period: 2007-2017.

The variables intptlat and intptlon were obtained from:

The United States Census Bureau, TIGERweb. Tab: Nation-Based Data Files. Tables: Metropolitan Statistical Areas - Census 2010, Micropolitan Statistical Areas - Census 2010.

Authors

Oliver Baldwin Edwards, Nicole Frontero, and Martin Glusker 2018

Special thanks to Alex Baldenko, Kendall Codey, and Jonah Gilbert for their invaluable help on this project.
