MRT Delay Data
Hi fellow redditors! We are two r/sg redditors (u/Sproinkerino & u/captmomo) who wanted to work on a mini-project to improve our programming/analytics skills. We would like to share the work we have done with you guys! We are pretty new to this, so any feedback is much appreciated.
1) Data Scraping from SMRT Twitter
Using Python, we scraped SMRT's Twitter feed to gather the delay/breakdown data from the past 5 years. We then cleaned the data to obtain the delays for specific lines and stations.
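As a rough illustration of the cleaning step (not the exact code used in this project), a tweet's text could be matched against line names and station hashtags with regular expressions. The keyword list, the line abbreviations, the hashtag convention, and the function name `parse_tweet` are all assumptions made for this sketch.

```python
import re

# Assumed patterns for a few MRT lines; the real cleaning rules may differ.
LINE_PATTERNS = {
    "North South Line": re.compile(r"\b(NSL|North[- ]South Line)\b", re.IGNORECASE),
    "East West Line": re.compile(r"\b(EWL|East[- ]West Line)\b", re.IGNORECASE),
    "Circle Line": re.compile(r"\b(CCL|Circle Line)\b", re.IGNORECASE),
}

# Assumed keywords that mark a tweet as a delay/breakdown notice.
DELAY_KEYWORDS = re.compile(
    r"\b(delay|fault|no train service|travel time)\b", re.IGNORECASE
)

# Assumption: stations are written as hashtags, e.g. "#AngMoKio".
# This also captures any non-station hashtags, which would need filtering.
STATION_PATTERN = re.compile(r"#([A-Za-z]+)")


def parse_tweet(text):
    """Return the lines and stations mentioned in a delay tweet, or None."""
    if not DELAY_KEYWORDS.search(text):
        return None  # not a delay/breakdown tweet
    lines = [name for name, pattern in LINE_PATTERNS.items() if pattern.search(text)]
    stations = STATION_PATTERN.findall(text)
    return {"lines": lines, "stations": stations}


# Made-up example text, written only to exercise the parser.
sample = "[NSL] Due to a track fault, pls add 15mins travel time btwn #AngMoKio and #Bishan."
result = parse_tweet(sample)
```

Keeping the patterns in one dictionary makes it easy to add more lines later without touching the parsing logic.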
2) Data Visualization with R
Next, we used ggplot2/ggmap in R to plot a map of all the stations along with their number of delays.
The darker the node, the more delays the station has experienced.
4) Data Summary
|Station with the most breakdowns|NS16 Ang Mo Kio|
|Line with the most breakdowns|North South Line|
|Line with the fewest breakdowns|Downtown Line|
5) Mini-Webapp to Find the Probability of a Delay
We also created a fun little webapp that finds the probability of a delay when travelling from one station to another! Link: https://mrt-breakdown.herokuapp.com/
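The post does not describe how the webapp computes its probability, so the following is only one plausible sketch: estimate it as the fraction of observed days on which at least one station along the route had a recorded delay. The function name `delay_probability`, the `(date, station_code)` log format, and the sample records are all hypothetical placeholders, not the project's real data.

```python
from datetime import date


def delay_probability(route, delay_log, total_days):
    """Estimate the chance of hitting a delay on a route.

    route: station codes passed through on the trip (assumed input format)
    delay_log: list of (date, station_code) delay records (assumed format)
    total_days: number of days covered by the log
    """
    if total_days <= 0:
        raise ValueError("total_days must be positive")
    route_stations = set(route)
    # Count each day once, even if several stations on the route were hit.
    affected_days = {day for day, station in delay_log if station in route_stations}
    return len(affected_days) / total_days


# Placeholder records for illustration only, not real delay data.
example_log = [
    (date(2018, 1, 1), "NS16"),
    (date(2018, 1, 1), "NS17"),
    (date(2018, 1, 3), "NS17"),
    (date(2018, 1, 5), "EW1"),
]
estimate = delay_probability(["NS16", "NS17"], example_log, total_days=10)
```

Counting affected days rather than individual records avoids double-counting a single incident that spans several stations on the same route.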
We hope this mini-project was interesting to you guys! Any feedback will be greatly appreciated! :)