datafold-dev / paper_modeling_melburnians

Modeling Melburnians - Using the Koopman operator to gain insight into crowd dynamics (Supplementary Material)

Supplementary material: Modeling Melburnians - Using the Koopman operator to gain insight into crowd dynamics

Code and data source

The core models are implemented with our own Python package, datafold.

The primary source of the data included in this repository is provided by the city of Melbourne:

The data is licensed under the Creative Commons Attribution 4.0 International Public License. For details, see https://creativecommons.org/licenses/by/4.0/ and https://creativecommons.org/licenses/by/4.0/legalcode

Run notebook

Before running notebook.ipynb, note that the computations can take a while, depending on your hardware, and require at least 16 GiB of memory. If the intermediate results are available as cached csv files and the flag use_cache is enabled in the notebook, most of the computationally expensive operations are skipped.
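The caching behaviour described above can be illustrated with a minimal sketch. The helper `load_or_compute`, the stand-in `expensive_step` and the file name `X_example.csv` are hypothetical and only illustrate the general pattern; the notebook's actual handling of `use_cache` may differ.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def load_or_compute(path, compute, use_cache=True):
    """Read the csv at `path` if caching is enabled and the file exists;
    otherwise run `compute()` and write its result to `path`."""
    if use_cache and os.path.exists(path):
        return pd.read_csv(path, index_col=0)
    result = compute()
    result.to_csv(path)
    return result


def expensive_step():
    # Stand-in for a costly computation (hypothetical, not from the notebook).
    rng = np.random.default_rng(0)
    return pd.DataFrame(rng.normal(size=(5, 2)), columns=["a", "b"])


cache_file = os.path.join(tempfile.mkdtemp(), "X_example.csv")

# First call: nothing is cached yet, so the result is computed and stored.
first = load_or_compute(cache_file, expensive_step, use_cache=False)
# Second call: the stored csv is read instead of recomputing.
second = load_or_compute(cache_file, expensive_step, use_cache=True)
```

With this pattern, running once with caching disabled regenerates the files, and later runs with caching enabled only read them.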

The notebook notebook.ipynb reproduces and visualizes the results from the paper. The file main.py contains the code for training the EDMD model and for plotting.

Jupyter notebook

To execute the code, Python>=3.7 and Jupyter are required.

To install the package requirements run:

pip install datafold==1.1.4 holidays==0.10.4

Open the Jupyter notebook with:

jupyter notebook notebook.ipynb

Overview of data files:

Raw data:

  • X_all.csv -- raw data from the original source, covering all sensors from 1/1/2016 to 12/31/2019 at a sampling rate of 1 hour
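As a sanity check on the stated time range and sampling rate, the following sketch builds the full hourly index for 2016 through 2019 and reports which timestamps are absent from a given index. It assumes the range runs from 2016-01-01 00:00 to 2019-12-31 23:00 inclusive; the helper `missing_timestamps` and the demo index are hypothetical, and the layout of X_all.csv is not assumed here.

```python
import pandas as pd

# Complete hourly grid for the stated period (assumed inclusive of the last day).
expected = pd.date_range(
    "2016-01-01 00:00", "2019-12-31 23:00", freq=pd.Timedelta(hours=1)
)
n_expected = len(expected)  # 1461 days * 24 hours = 35064


def missing_timestamps(index):
    """Return the hourly timestamps of the expected grid that `index` lacks."""
    return expected.difference(pd.DatetimeIndex(index))


# Demo on a synthetic index with two hours removed.
demo_index = expected.delete([10, 11])
gaps = missing_timestamps(demo_index)
```

Passing the time index of the loaded raw data to `missing_timestamps` would show whether any hourly samples are absent.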

Cached data:

Note that, because of size restrictions in the official supplementary material of the paper, the cached files are only included in the GitHub repository (see link above). All cache files are generated locally if the notebook is run with use_cache=False.

  • X_selected.csv
    selected sensors and samples from X_all.csv as highlighted in the paper
  • X_windows_[train|test].csv
    time series windows of length 193; each window forms a time series pair of 169 hours (initial condition) + 24 hours (prediction)
  • X_reconstruct_[train|test].csv
    reconstructed data in time series of length 24 for each window in corresponding X_windows
  • X_latent_[train|test].csv
    diffusion map values for each prediction window
  • X_latent_interp_test.csv
    diffusion map values interpolated with the EDMD model
  • X_eigfunc_test.csv
    complex Koopman eigenfunction values
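The window layout described for X_windows_[train|test].csv (193 hours, split into 169 hours of initial condition and 24 hours of prediction) can be sketched as follows. The synthetic data, the column name `sensor_1`, the helper `split_windows` and the choice of non-overlapping windows are assumptions for illustration; the notebook may select and align its windows differently.

```python
import numpy as np
import pandas as pd

N_INITIAL = 169                  # hours of initial condition (from the README)
N_PREDICT = 24                   # hours of prediction (from the README)
WINDOW = N_INITIAL + N_PREDICT   # 193 hours per window

# Synthetic hourly series standing in for one sensor column.
index = pd.date_range(
    "2016-01-01", periods=3 * WINDOW, freq=pd.Timedelta(hours=1)
)
X = pd.DataFrame({"sensor_1": np.arange(len(index), dtype=float)}, index=index)


def split_windows(X, window=WINDOW, n_initial=N_INITIAL):
    """Cut X into consecutive, non-overlapping windows and split each one
    into an (initial condition, prediction target) pair."""
    pairs = []
    for start in range(0, len(X) - window + 1, window):
        block = X.iloc[start:start + window]
        pairs.append((block.iloc[:n_initial], block.iloc[n_initial:]))
    return pairs


pairs = split_windows(X)
initial, target = pairs[0]
```

Each pair holds the 169 samples a model would receive as input and the 24 samples against which its prediction would be compared.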
