MRT Delay Data

Hi fellow redditors! We are two r/sg redditors (u/Sproinkerino & u/captmomo) who wanted to work on a mini-project to improve our programming/analytics skills, and we would like to share the work we have done. We are new to this, so any feedback is much appreciated.

1) Data Scraping from SMRT Twitter

Using Python, we scraped SMRT's Twitter feed to gather delay/breakdown data from the past 5 years. The data was then cleaned to extract the delays for specific lines and stations.
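For anyone curious what a cleaning step like this can look like, here is a minimal sketch. It is not our exact code: the bracketed line tags, the hashtag station format, the keyword list and the function name `parse_tweet` are all assumptions made for illustration, and the real feed may be formatted differently.

```python
import re

# Assumed tweet conventions (illustrative only): a bracketed line tag
# such as "[NSL]" and stations written as hashtags such as "#Bishan".
LINE_TAG = re.compile(r"\[(NSL|EWL|CCL|NEL|DTL)\]")
STATION_TAG = re.compile(r"#(\w+)")

# Hypothetical keywords used to decide whether a tweet reports a delay.
DELAY_WORDS = ("delay", "fault", "no train service", "additional travel time")


def parse_tweet(text):
    """Return (line, stations) for a delay tweet, or None otherwise."""
    line_match = LINE_TAG.search(text)
    if not line_match:
        return None  # no recognisable line tag
    lowered = text.lower()
    if not any(word in lowered for word in DELAY_WORDS):
        return None  # tagged with a line, but not about a delay
    stations = STATION_TAG.findall(text)
    return line_match.group(1), stations
```

Calling `parse_tweet` on every scraped tweet and dropping the `None` results would leave one (line, stations) record per delay tweet, which is the kind of table the later steps work from.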

2) Data Visualization with R

Next, we used ggplot2/ggmap in R to draw a map of all the stations, with each station shaded by its number of delays.

3) Here is the plot: https://imgur.com/a/b7D728p

The darker the node, the more delays the station has experienced.

4) Data Summary

Station with the most breakdowns: NS16 Ang Mo Kio
Line with the most breakdowns: North South Line
Line with the fewest breakdowns: Downtown Line
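A summary like the one above boils down to counting delay records per station and per line. Below is a small sketch of that tally. The records are made-up placeholders and not our actual data, and `summarise` is just an illustrative name.

```python
from collections import Counter

# Made-up records for illustration only: one (line, station) pair per delay.
records = [
    ("Line A", "Station 1"),
    ("Line A", "Station 1"),
    ("Line A", "Station 2"),
    ("Line B", "Station 3"),
]


def summarise(records):
    """Find the most and least affected entries among the given records."""
    line_counts = Counter(line for line, _ in records)
    station_counts = Counter(station for _, station in records)
    ranked_lines = line_counts.most_common()  # sorted from most to fewest
    return {
        "worst_station": station_counts.most_common(1)[0][0],
        "worst_line": ranked_lines[0][0],
        "best_line": ranked_lines[-1][0],
    }
```

One caveat with this approach: a line with zero recorded delays never shows up in the counts, so it would have to be added separately before picking the line with the fewest breakdowns.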

5) Mini-Webapp to find probability of a delay

We also created a fun little webapp that estimates the probability of a delay when travelling from one station to another! Link: https://mrt-breakdown.herokuapp.com/
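To give a feel for how a probability like this could be estimated, here is one simple approach, sketched under stated assumptions and not necessarily what the webapp does: take the share of observed days on which at least one station along the route had a delay. The function name, the data layout (a dict mapping each station to the set of day indices with a delay) and the numbers in the example are all hypothetical.

```python
def delay_probability(route, delay_days_by_station, total_days):
    """Share of observed days with a delay at any station on the route.

    route: list of station names passed through (assumed known).
    delay_days_by_station: dict of station -> set of day indices with a delay.
    total_days: number of days covered by the data.
    """
    if total_days <= 0:
        raise ValueError("total_days must be positive")
    affected_days = set()
    for station in route:
        # Union, so a day with delays at two stations is counted once.
        affected_days |= delay_days_by_station.get(station, set())
    return len(affected_days) / total_days


# Toy example with invented numbers: 3 affected days out of 10.
toy_delays = {"A": {1, 2}, "B": {2, 3}, "C": {9}}
estimate = delay_probability(["A", "B"], toy_delays, total_days=10)
```

This treats every day as equally likely and ignores time of day, so it is a rough historical frequency and not a real forecast.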

We hope you found this mini-project interesting! Any feedback will be greatly appreciated! :)


