garaud / jitenshea

Bicycle-sharing data analysis

Add short-term predictions into Luigi pipeline

delhomer opened this issue

Some preliminary work has been done in a separate GitHub project. It would be worth integrating it into the Luigi data pipeline built here.

As input:

  • timeserie table, for any city

As output:

  • short_term_predictions table with columns station_id, nb_bikes, nb_stands, ts, stored in the corresponding city schema.

The timestamp column (ts) will hold the date and hour that the prediction refers to, so that new predictions can be added to the table on the fly (instead of dropping old predictions when new ones are made). As a consequence, the primary key of the output table is (station_id, ts).
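The table layout above can be sketched as follows. This is only a rough illustration: it uses an in-memory SQLite database as a stand-in for the project database, the column types are guesses, the `add_predictions` helper is hypothetical, and replacing a row when the same (station_id, ts) pair is predicted again is an assumption that the issue does not settle. The sample rows are made up.

```python
import sqlite3

# Hypothetical DDL: column names come from the issue, column types are assumptions.
DDL = """
CREATE TABLE IF NOT EXISTS short_term_predictions (
    station_id INTEGER NOT NULL,
    ts         TEXT    NOT NULL,  -- date and hour the prediction refers to
    nb_bikes   REAL,
    nb_stands  REAL,
    PRIMARY KEY (station_id, ts)
)
"""


def add_predictions(conn, rows):
    """Append (station_id, ts, nb_bikes, nb_stands) rows, keeping older predictions.

    Assumption: a newer prediction for an existing (station_id, ts) pair
    overwrites the previous one.
    """
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO short_term_predictions "
            "(station_id, ts, nb_bikes, nb_stands) VALUES (?, ?, ?, ?)",
            rows,
        )


conn = sqlite3.connect(":memory:")
conn.execute(DDL)

# Two successive prediction runs (made-up values): the second run does not
# drop the rows written by the first one.
add_predictions(conn, [(1, "2018-05-01 10:00", 7.0, 13.0)])
add_predictions(conn, [(1, "2018-05-01 11:00", 5.0, 15.0)])
n_rows = conn.execute("SELECT COUNT(*) FROM short_term_predictions").fetchone()[0]
```

With a composite primary key, each run only adds rows for new timestamps, so the table accumulates a prediction history per station.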

This is a first draft. We could add more information to the table, such as the true values once they become available, along with error measures (absolute, relative, quadratic, and so on).
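As a rough sketch of the error measures mentioned above, assuming one predicted value and one observed value per row. The function name `prediction_errors` and the choice to return `None` for the relative error when the observed value is zero are assumptions made for illustration, not something decided in this issue.

```python
def prediction_errors(predicted, observed):
    """Return absolute, relative and quadratic errors for a single prediction."""
    absolute = abs(predicted - observed)
    # The relative error is undefined when the observed value is zero
    # (e.g. an empty station), so None is returned in that case (assumption).
    relative = absolute / observed if observed != 0 else None
    quadratic = (predicted - observed) ** 2
    return {"absolute": absolute, "relative": relative, "quadratic": quadratic}


# 8 bikes predicted, 10 bikes actually observed.
errors = prediction_errors(8, 10)
```

These values could be stored as extra columns next to the true values, or computed on demand when evaluating the model.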

PR #9 added the model training step. The inference step, i.e. actually predicting bike availability, is still missing.

PR #32 should solve the issue. Once it is merged into master, this issue can be closed.