skywalker023 / weather_metro_analysis

🌦 With KMA weather dataset & 🚃 Seoul metro dataset

Weather & Metro Analysis

With KMA weather dataset & Seoul metro dataset

Visualization

Sinchon station

[Figure: Sinchon station]

Jamsil station

[Figure: Jamsil station]

Gangnam station

[Figure: Gangnam station]

All three stations display a highly regular pattern with little variation.

Variables

  • Number of passengers getting on/off between 17:00 and 23:00
  • Mean nighttime temperature, humidity, and rainfall
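
As a rough illustration of how these variables could be assembled into one row per day, here is a minimal pandas sketch. The column names (`date`, `hour`, `on`, `off`, `temp_mean`, `humidity_mean`, `rainfall_mean`), the sample values, and the helper `build_features` are all hypothetical; the actual KMA and Seoul metro files may be laid out differently.

```python
import pandas as pd

# Hypothetical hourly ridership records (made-up values, not real data).
# "hour" is assumed to mean the hour in which the passenger count starts.
ridership = pd.DataFrame({
    "date": ["2017-03-01"] * 4 + ["2017-03-02"] * 4,
    "hour": [8, 17, 18, 23] * 2,
    "on": [2000, 1200, 1500, 300, 2100, 1100, 1600, 280],
    "off": [1800, 900, 1300, 150, 1750, 950, 1250, 170],
})

# Hypothetical nightly weather means (made-up values).
weather = pd.DataFrame({
    "date": ["2017-03-01", "2017-03-02"],
    "temp_mean": [4.2, 6.1],
    "humidity_mean": [55.0, 62.0],
    "rainfall_mean": [0.0, 1.3],
})


def build_features(ridership, weather, start_hour=17, end_hour=23):
    """Return one row per date with on/off counts per evening hour plus weather."""
    # Keep only the evening window (both ends inclusive).
    evening = ridership[ridership["hour"].between(start_hour, end_hour)]
    # One column per (direction, hour) pair.
    wide = evening.pivot_table(
        index="date", columns="hour", values=["on", "off"], aggfunc="sum"
    )
    # Flatten the two-level column index into names such as "on_17".
    wide.columns = [f"{kind}_{hour}" for kind, hour in wide.columns]
    return wide.reset_index().merge(weather, on="date", how="inner")


features = build_features(ridership, weather)
print(features)
```

Records outside the evening window (the 08:00 rows here) are dropped before pivoting, so only evening columns reach the model.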

Target

  • Usage of the last train at Sinchon station

Model

  • Gradient boosting
  • Random forest
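
The two model families listed above could be fit along the following lines. This is only a sketch using scikit-learn's `GradientBoostingRegressor` and `RandomForestRegressor` on synthetic data; the repository may use a different library, different hyperparameters, or a different train/test split, and the feature count of 17 is an arbitrary assumption.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the real feature table: 300 days, 17 features.
n_days = 300
X = rng.normal(size=(n_days, 17))
# Synthetic target that depends on two of the features plus noise.
y = 500 + 120 * X[:, 4] + 30 * X[:, 14] + rng.normal(scale=20, size=n_days)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
}

predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)
    # score() returns R^2 on the held-out days.
    print(name, "R^2:", round(model.score(X_test, y_test), 3))
```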

Result

  • RMSE: 351
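
For reference, RMSE (root mean squared error) is the square root of the mean squared difference between predicted and actual values, so it is expressed in the same units as the target. A minimal sketch of the computation, with a hypothetical helper name `rmse` and made-up numbers:

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Made-up values: each prediction is off by 10, so the RMSE is 10.
print(rmse([100, 200], [110, 190]))  # 10.0
```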

Feature Importance

[Figure: feature importance]

The number of incoming passengers between 7 P.M. and 8 P.M. has the highest importance when estimating last-train usage.
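
A ranking like this can be read from a fitted tree ensemble. The sketch below uses scikit-learn's `feature_importances_` attribute on a random forest; the feature names are hypothetical, and the synthetic target is deliberately constructed to depend on `off_19`, so the output only illustrates the mechanics and does not reproduce the result above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical feature names: on/off counts per starting hour, plus weather.
feature_names = [
    f"{kind}_{hour}" for kind in ("on", "off") for hour in range(17, 23)
] + ["temp_mean", "humidity_mean", "rainfall_mean"]

rng = np.random.default_rng(0)
X = rng.normal(size=(400, len(feature_names)))

# Synthetic target driven by one feature, chosen only for illustration.
driver = feature_names.index("off_19")
y = 100 * X[:, driver] + rng.normal(scale=10, size=400)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances are non-negative and sum to 1.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked[:5]:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances can favor features with many distinct values, so a permutation-based check may be worth running alongside them.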

Clustering by boarding pattern

By "getting on" pattern

[Figure: clusters by "getting on" pattern]

By "getting off" pattern

[Figure: clusters by "getting off" pattern]
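
One way to cluster stations by the shape of their hourly pattern is sketched below. It assumes k-means with two clusters on profiles normalized to sum to 1, so that stations are grouped by when passengers travel rather than by how many. The clustering algorithm, the number of clusters, and the synthetic profile shapes are all assumptions, not details taken from the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
hours = np.arange(5, 24)  # assumed operating hours, 05:00 to 23:00


def bump(center, width=1.5):
    """Bell-shaped ridership peak centered on the given hour."""
    return np.exp(-0.5 * ((hours - center) / width) ** 2)


def synthetic_station(morning_weight, evening_weight):
    """Made-up hourly counts with a random overall volume and small noise."""
    scale = rng.uniform(500, 5000)
    shape = morning_weight * bump(8) + evening_weight * bump(22)
    return scale * shape + rng.uniform(0, 0.05 * scale, size=hours.size)


# Ten stations with a morning-heavy profile, ten with a late-evening-heavy one.
profiles = np.array(
    [synthetic_station(1.0, 0.3) for _ in range(10)]
    + [synthetic_station(0.3, 1.0) for _ in range(10)]
)

# Normalize each station to hourly shares so volume does not drive the clusters.
shares = profiles / profiles.sum(axis=1, keepdims=True)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(shares)
print(labels)
```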

Discussion

  • Stations located in hot places can be distinguished just by their "getting on/off" patterns
  • Classification might be possible: hot / not-hot
    • This still needs to be applied to subway lines 5, 6, 7, and 8
  • Further research might be fun: the gray area between hot and not-hot areas
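
If hot / not-hot labels were available for some stations, the classification idea could be tried roughly as follows. This is a speculative sketch: the labels, the synthetic profiles (hot stations are arbitrarily given a late-evening peak), and the choice of `RandomForestClassifier` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
hours = np.arange(5, 24)  # assumed operating hours, 05:00 to 23:00


def synthetic_profile(peak_hour):
    """Made-up hourly share profile with a single peak and small noise."""
    shape = np.exp(-0.5 * ((hours - peak_hour) / 1.5) ** 2)
    noisy = shape + rng.uniform(0.0, 0.05, size=hours.size)
    return noisy / noisy.sum()


# Hypothetical labels: 1 = hot, 0 = not-hot.
X = np.array(
    [synthetic_profile(22) for _ in range(30)]
    + [synthetic_profile(8) for _ in range(30)]
)
y = np.array([1] * 30 + [0] * 30)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print("held-out accuracy:", accuracy)
```

Real stations in the gray area would not separate this cleanly, so predicted class probabilities (`clf.predict_proba`) may be more informative than hard labels.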
