MilesCranmer / PySR

High-Performance Symbolic Regression in Python and Julia

Home Page: https://astroautomata.com/PySR

[Feature] PySRSequenceRegressor

MilesCranmer opened this issue

To define a symbolic regression model that predicts recurrence relations in sequences, I want to create a PySRSequenceRegressor (see #88 for the scikit-learn API) that automatically sets this up for the user, given a sequence X.

This would require basically no core modifications; it is just a preprocessing step. It could also be used to model any time series with a fixed step size, including differential equations (although the learned symbolic model would only make single-step predictions, not rollouts), so I think this would be a nice addition.

The only required argument would be history_length, which tells the preprocessing step how many previous time steps to use as the features of a single datapoint when predicting the next step.
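As a rough illustration of the preprocessing that history_length implies, here is a minimal sketch for a single 1D sequence. The helper name `make_history_windows` is hypothetical and is not part of PySR; it only shows one way the sliding windows could be built with NumPy.

```python
import numpy as np


def make_history_windows(sequence, history_length):
    """Turn a 1D sequence into (X, y) pairs for next-step prediction.

    Hypothetical helper, not part of PySR. Each row of X holds
    `history_length` consecutive values; the matching entry of y is
    the value that immediately follows them.
    """
    seq = np.asarray(sequence, dtype=float)
    if seq.ndim != 1:
        raise ValueError("This sketch only handles a single 1D sequence.")
    if history_length < 1 or history_length >= seq.shape[0]:
        raise ValueError("history_length must be in [1, len(sequence) - 1].")

    n_samples = seq.shape[0] - history_length
    # Row i indexes positions i, i+1, ..., i+history_length.
    idx = np.arange(n_samples)[:, None] + np.arange(history_length + 1)[None, :]
    windows = seq[idx]
    X = windows[:, :-1]  # the history for each sample
    y = windows[:, -1]   # the next value to predict
    return X, y


# Fibonacci-like sequence: each target is the sum of the two previous values.
X, y = make_history_windows([1, 1, 2, 3, 5, 8, 13], history_length=2)
```

The resulting X and y could then be passed to an ordinary regressor's fit method, which is why no core changes would be needed.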

The user would be allowed to pass a 1D array (a single sequence) or a 3D array (a batch of sequences with multiple features each). 2D arrays would not be allowed as input, since they are ambiguous with regard to batching vs. multiple features. Passing a 2D array would raise an error telling the user to pass a 3D array instead, and giving the expected axis configuration.

Passing a 3D array would emit a warning letting the user know which axis is being interpreted as batch and which as feature. This warning could be silenced with a flag.
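The input rules above could be sketched roughly as follows. The function name `check_sequence_input`, the flag name `warn_on_3d`, and the (batch, time, feature) axis order are all assumptions made for illustration; the issue does not fix any of them.

```python
import warnings

import numpy as np


def check_sequence_input(X, warn_on_3d=True):
    """Validate sequence input and return a 3D array.

    Hypothetical sketch. The assumed axis order is
    (batch, time, feature); the proposal does not specify one.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        # A single sequence with one feature.
        return X.reshape(1, -1, 1)
    if X.ndim == 2:
        raise ValueError(
            "2D input is ambiguous (batch vs. feature axis). "
            "Pass a 3D array shaped (batch, time, feature) instead."
        )
    if X.ndim == 3:
        if warn_on_3d:
            warnings.warn(
                "Interpreting axis 0 as batch, axis 1 as time, "
                "and axis 2 as feature."
            )
        return X
    raise ValueError(f"Expected a 1D or 3D array, got {X.ndim}D.")


single = check_sequence_input(np.arange(5.0))
batch = check_sequence_input(np.zeros((2, 5, 3)), warn_on_3d=False)
```

Normalizing everything to a 3D array up front means the later windowing step only has to handle one layout.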

The user could also use PySRRegressor and do the preprocessing themselves.

(cc @patrick-kidger @kazewong, in case of interest)