MilesCranmer / PySR

High-Performance Symbolic Regression in Python and Julia

Home Page: https://astroautomata.com/PySR

[BUG] *not working if y is pd.DataFrame*

butter517cup opened this issue

It works when X is pd.DataFrame.
However, if y is pd.DataFrame, it gives the following error (it works if y is np.array though):

Warning (from warnings module):
  File "C:\Program Files\Python39\lib\site-packages\pysr\sr.py", line 942
    warnings.warn("Resetting variable_names from X.columns")
UserWarning: Resetting variable_names from X.columns
Traceback (most recent call last):
  File "C:\Users\butter\Documents\pySR_test.py", line 41, in <module>
    model.fit(X, y)
  File "C:\Program Files\Python39\lib\site-packages\pysr\sr.py", line 808, in fit
    self._run(
  File "C:\Program Files\Python39\lib\site-packages\pysr\sr.py", line 953, in _run
    assert not isinstance(y, pd.DataFrame)
AssertionError

If y is a pd.Series, it gives the following error:
Traceback (most recent call last):
  File "C:\Users\butter\Documents\pySR_test.py", line 41, in <module>
    model.fit(X, y)
  File "C:\Program Files\Python39\lib\site-packages\pysr\sr.py", line 808, in fit
    self._run(
  File "C:\Program Files\Python39\lib\site-packages\pysr\sr.py", line 985, in _run
    y = y.reshape(-1)
  File "C:\Program Files\Python39\lib\site-packages\pandas\core\generic.py", line 5583, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'reshape'

I think it would be good if y could also be a DataFrame.

This is actually expected behaviour, which is why there is an AssertionError. However, you raise a good point: I think I should just convert pd.Series into numpy arrays internally, since from the user's point of view, if X can be a pandas object, then y should be too. I will close this issue once that is added.
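
The kind of internal conversion described above could look roughly like the following. This is only a sketch, not the code from the actual fix: `to_numpy_target` is a hypothetical helper name, and flattening only single-column targets is an assumption made here for illustration.

```python
import numpy as np
import pandas as pd


def to_numpy_target(y):
    """Hypothetical helper: coerce a pandas or array-like target to a NumPy array."""
    if isinstance(y, (pd.Series, pd.DataFrame)):
        y = y.to_numpy()
    y = np.asarray(y)
    # Flatten a single-column 2-D target to 1-D; leave multi-column targets 2-D.
    if y.ndim == 2 and y.shape[1] == 1:
        y = y.reshape(-1)
    return y


y_from_series = to_numpy_target(pd.Series([1.0, 2.0, 3.0]))
y_from_frame = to_numpy_target(pd.DataFrame({"target": [1.0, 2.0, 3.0]}))
y_multi = to_numpy_target(pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]}))
```

Doing the conversion once at the top of the fitting routine would let the existing `y.reshape(-1)` call operate on an array regardless of what the caller passed in.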

Fixed with ad8332d. y can be a pd.Series or pd.DataFrame.