liyongh1 / EDA-of-TTC-Delays

Exploratory Data Analysis on TTC Transit Delay Data - Potential Correlations Between Specific Time Periods and Delay Frequencies

authors: Zhaotian Li, Yonghao Li, Hongtianxu Hua

date: 8 Feb, 2020

This paper examines how the time of day, week, and year affects the frequency of delays on the TTC transit system, including subways, buses, and streetcars. In particular, it applies exploratory data analysis methods (summary statistics, grouping, and plotting) to the TTC Delay Datasets (from Open Data Toronto) and shows that the number of delays peaks during the morning and afternoon rush hours. More importantly, we found that “Injured or ill Customer”, “Speed Control”, and “Passenger Assistance Alarm Activated - No Trouble Found” are the most influential reasons for delay occurrences. Our initial hypothesis was that the frequency of daily, weekly, and monthly delays would remain relatively consistent over time. However, the exploratory analysis showed that delay occurrences correlate with specific time periods. A predictive model built on this EDA might be able to predict delay frequency from temporal factors. This paper offers insight into the factors contributing to TTC delays and may be of interest to the authorities and management in their efforts to improve transit efficiency.
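
The grouping step described above can be illustrated with a minimal sketch. This is not the authors' code: the records, the field names `timestamp` and `reason`, and the helper functions are all hypothetical, and the real Open Data Toronto files have their own schema that would need to be mapped onto this shape first. The sketch only shows the general idea of counting delays per hour, per weekday, and per reported reason.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records for illustration only; these are NOT real TTC data
# and the field names are assumptions, not the actual dataset columns.
SAMPLE_DELAYS = [
    {"timestamp": "2019-01-07 08:15", "reason": "Injured or ill Customer"},
    {"timestamp": "2019-01-07 08:40", "reason": "Speed Control"},
    {"timestamp": "2019-01-07 08:55", "reason": "Injured or ill Customer"},
    {"timestamp": "2019-01-07 13:05", "reason": "Speed Control"},
    {"timestamp": "2019-01-08 17:20", "reason": "Injured or ill Customer"},
    {"timestamp": "2019-01-08 17:45",
     "reason": "Passenger Assistance Alarm Activated - No Trouble Found"},
    {"timestamp": "2019-01-12 11:30", "reason": "Injured or ill Customer"},
]

TIME_FORMAT = "%Y-%m-%d %H:%M"  # assumed timestamp layout of the sample records


def delays_by_hour(records):
    """Count delay records per hour of day (0-23)."""
    return Counter(
        datetime.strptime(rec["timestamp"], TIME_FORMAT).hour for rec in records
    )


def delays_by_weekday(records):
    """Count delay records per weekday (0 = Monday ... 6 = Sunday)."""
    return Counter(
        datetime.strptime(rec["timestamp"], TIME_FORMAT).weekday() for rec in records
    )


def top_reasons(records, n=3):
    """Return the n most frequent delay reasons as (reason, count) pairs."""
    return Counter(rec["reason"] for rec in records).most_common(n)


def peak_hours(hour_counts, n=2):
    """Return the n hours with the most delays, busiest first."""
    return [hour for hour, _ in hour_counts.most_common(n)]


if __name__ == "__main__":
    hourly = delays_by_hour(SAMPLE_DELAYS)
    print("Delays per hour:", dict(sorted(hourly.items())))
    print("Peak hours:", peak_hours(hourly))
    print("Delays per weekday:", dict(sorted(delays_by_weekday(SAMPLE_DELAYS).items())))
    print("Top reasons:", top_reasons(SAMPLE_DELAYS))
```

Counting with `Counter` keeps the sketch dependency-free; on the full datasets a data-frame library would likely be more convenient for the same grouping and for plotting the resulting counts.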
