marija-iloska / tpls_python

Transdimensional Predictive Least Squares for Online Feature Selection

Transdimensional Predictive Least Squares (TPLS)

Note: A MATLAB implementation (including reproducible figures) is available at tpls_matlab.

This code is a Python implementation of the TPLS algorithm proposed in our paper "Transdimensional Model Learning with Online Feature Selection based on Predictive Least Squares". We provide example code showing how to run TPLS, as well as pre-coded feature bar plots. The code to reproduce the experiments presented in our paper is available only in MATLAB.

Introduction

TPLS is a distribution-free online feature selection algorithm based entirely on least squares (LS). As new data arrive, TPLS recursively updates the parameter estimate not only in value but also in dimension. What makes TPLS unique is its ability to move recursively up and down in model dimension, and its use of the predictive error (rather than the fitting error) as the criterion for adding or removing features. Specifically, the foundations of TPLS are recursive LS (RLS), order-recursive LS (ORLS), and predictive LS (PLS).
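To make the RLS building block concrete, here is a minimal NumPy sketch of the textbook recursive update for one new data point. It is not taken from this repository: the function name `rls_update` and its arguments are hypothetical, and the class RLS in LS_updates.py may be organized differently. The quantity `pred_error` is the one-step-ahead predictive error, computed before the estimate sees the new observation.

```python
import numpy as np

def rls_update(theta, P, h, y):
    """One textbook RLS step for a new data point (h, y).

    Hypothetical helper for illustration, not the repository's API.
    theta : current estimate, shape (k,)
    P     : current inverse Gram matrix (H^T H)^{-1}, shape (k, k)
    h     : new feature vector, shape (k,)
    y     : new scalar observation
    """
    Ph = P @ h
    gain = Ph / (1.0 + h @ Ph)        # gain vector
    pred_error = y - h @ theta        # predictive error, uses the old estimate
    theta_new = theta + gain * pred_error
    P_new = P - np.outer(gain, Ph)    # Sherman-Morrison update of the inverse
    return theta_new, P_new, pred_error

# Synthetic data with assumed coefficients, for illustration only
rng = np.random.default_rng(0)
H = rng.normal(size=(50, 3))
y = H @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

# Initialize with a batch LS fit on the first t0 points, then recurse
t0 = 5
P = np.linalg.inv(H[:t0].T @ H[:t0])
theta = P @ H[:t0].T @ y[:t0]
for t in range(t0, len(y)):
    theta, P, _ = rls_update(theta, P, H[t], y[t])

# The recursive estimate should agree with a batch LS fit on all data
theta_batch = np.linalg.lstsq(H, y, rcond=None)[0]
```

After the loop, `theta` matches `theta_batch` up to floating-point error, which is a quick way to check that the recursion is implemented correctly.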

How to Use Code

How to run TPLS

Jupyter Notebook to run:
example_code.ipynb - a notebook that demonstrates how to call and run TPLS. It includes feature bar plots and the predictive error. For the interested user, it also demonstrates how to compute and plot the MSE regret analysis.

About the Code

LS_updates.py

A module which contains all LS related updates, including MSE and predictive error calculations.

Class RLS - updates the model recursively with each new data point.
Attributes: ascend()

Class ORLS - updates the model recursively in dimension (up/down --> add/remove feature) based on user's choice.
Attributes: ascend(), descend()

Class PredError - computes the predictive error for the present model.
Attributes: compute()

Class Expectations - computes the MSE difference between the true model and a neighbor model (up/down) for all given features. It can compute the MSE for a single time instant or for a batch.
Attributes: model_up(), model_down(), batch()
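The order-recursive idea behind the ORLS class described above can be sketched with the standard block-matrix inverse identity. The helpers below are hypothetical (`orls_add_feature` and `orls_remove_last` are illustrative names, not the repository's ascend() and descend() methods). They assume the inverse Gram matrix of the current model is available and that the feature to remove sits in the last column.

```python
import numpy as np

def orls_add_feature(theta, D, H, h_new, y):
    """Grow an LS fit by one column (hypothetical helper, not the repo's API).

    theta : LS estimate for the current columns H, shape (k,)
    D     : (H^T H)^{-1}, shape (k, k)
    h_new : candidate feature column, shape (N,)
    """
    b = H.T @ h_new
    Db = D @ b
    s = h_new @ h_new - b @ Db                 # scalar Schur complement
    theta_last = (h_new @ y - b @ theta) / s   # coefficient of the new feature
    theta_up = np.append(theta - Db * theta_last, theta_last)
    D_up = np.block([[D + np.outer(Db, Db) / s, -Db[:, None] / s],
                     [-Db[None, :] / s, np.array([[1.0 / s]])]])
    return theta_up, D_up

def orls_remove_last(theta, D):
    """Shrink an LS fit by dropping its last column (hypothetical helper)."""
    d12, d22 = D[:-1, -1], D[-1, -1]
    theta_down = theta[:-1] - d12 * theta[-1] / d22
    D_down = D[:-1, :-1] - np.outer(d12, d12) / d22
    return theta_down, D_down

# Synthetic data with assumed coefficients, for illustration only
rng = np.random.default_rng(1)
H_full = rng.normal(size=(40, 3))
y = H_full @ np.array([1.0, 0.0, -1.5]) + 0.1 * rng.normal(size=40)

# Start from a two-feature model, move up one dimension, then back down
H = H_full[:, :2]
D = np.linalg.inv(H.T @ H)
theta = D @ H.T @ y

theta_up, D_up = orls_add_feature(theta, D, H, H_full[:, 2], y)
theta_down, D_down = orls_remove_last(theta_up, D_up)
```

Here `theta_up` agrees with a batch LS fit on all three columns, and `theta_down` recovers the original two-feature estimate. To remove a feature that is not in the last position, one option is to permute the rows and columns of `D` and the entries of `theta` so that it becomes last.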

algorithms.py

A module of two algorithms:
Class ModelJump - From the present model, loops over all features (add or remove) and computes the predictive error for each proposed model.
Attributes: up(), down(), stay()

Class TPLS - Implements the final TPLS algorithm for one time step, using ModelJump and all in LS_updates.py.
Attributes: model_update(), time_update()
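The jump decision described for ModelJump can be illustrated with a deliberately naive sketch: build the current model and its one-feature neighbors, score each by its accumulated one-step-ahead predictive error, and keep the minimizer. Everything below is an assumption made for illustration. The names `predictive_error` and `propose_models` are hypothetical, and each candidate is refit from scratch at every time step for clarity, whereas a recursive implementation would reuse RLS and ORLS updates instead.

```python
import numpy as np

def predictive_error(H, y, features, t0):
    """Accumulated squared one-step-ahead error for a feature subset.

    At each time t >= t0 the model is fit on data before t only and then
    used to predict y[t]. Naive refit, for illustration only.
    """
    cols = list(features)
    total = 0.0
    for t in range(t0, len(y)):
        theta = np.linalg.lstsq(H[:t, cols], y[:t], rcond=None)[0]
        total += (y[t] - H[t, cols] @ theta) ** 2
    return total

def propose_models(current, n_features):
    """Current model plus every neighbor one feature up or down."""
    current = sorted(current)
    stay = [current]
    up = [sorted(current + [j]) for j in range(n_features) if j not in current]
    # Skip removals that would leave an empty model
    down = [[i for i in current if i != j] for j in current if len(current) > 1]
    return stay + up + down

# Synthetic data: only features 0 and 2 are active (assumed for this example)
rng = np.random.default_rng(2)
H = rng.normal(size=(200, 4))
y = 2.0 * H[:, 0] + 3.0 * H[:, 2] + 0.1 * rng.normal(size=200)

# Starting from the model {0}, score stay and all neighbors, keep the best
candidates = propose_models([0], n_features=4)
errors = [predictive_error(H, y, c, t0=10) for c in candidates]
best = candidates[int(np.argmin(errors))]
```

Because feature 2 carries a strong signal in this synthetic example, the candidate that adds it has by far the smallest predictive error, so `best` moves the model up one dimension to features 0 and 2.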

util.py

Helper functions for generating synthetic data, initialization, bar plotting, extracting feature indices, and finding min index.
Attributes: generate_data(), initialize(), bar_plot(), get_features(), get_min().
