AutoViML / Auto_TS

Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost Models on Time Series data sets with a Single Line of Code. Created by Ram Seshadri. Collaborators welcome.

building ML model with at least 1 order of differencing fails at test_stationarity

ghanasyam-rallapalli opened this issue

Either ts_df[[target]] needs to be passed as a Series in this call:

diff_limit = test_stationarity(ts_df[[target]], plot=False, verbose=True, var_only=False)

or the DataFrame needs to be converted to an array in the block below:
else:
    ### In non-VAR models you need to test only the target variable for stationarity ##
    timeseries = copy.deepcopy(time_df)
    dftest = smt.adfuller(timeseries, maxlag=maxlag, regression=regression, autolag=autolag)
    dfoutput = pd.Series(dftest[0:4], index=['Test Statistic',
                                             'p-value',
                                             '#Lags Used',
                                             'Number of Observations Used',
                                             ],name='Dickey-Fuller Augmented Test')
    for key, value in dftest[4].items():
        dfoutput['Critical Value (%s)' % key] = value
    if verbose:
        print('Results of Augmented Dickey-Fuller Test:')
        pretty_print_table(dfoutput)
    if dftest[1] >= alpha:
        print(' this series is non-stationary. Trying test again after differencing...')
        timeseries = pd.Series(timeseries).diff(1).dropna().values

The following line throws an error because it receives a DataFrame rather than a Series or an array:

timeseries = pd.Series(timeseries).diff(1).dropna().values
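One possible shape for the fix, as a minimal sketch only: normalise the input to a 1-D Series before differencing, so that the differencing step works whether the caller passes ts_df[[target]], ts_df[target], or an array. The helper names to_1d_series and first_difference and the "sales" column are hypothetical and are not part of Auto_TS; the actual fix in the library may look different.

```python
import numpy as np
import pandas as pd


def to_1d_series(data):
    """Return `data` as a pandas Series (hypothetical helper, not part of Auto_TS)."""
    if isinstance(data, pd.DataFrame):
        if data.shape[1] != 1:
            raise ValueError("expected exactly one column, got %d" % data.shape[1])
        # iloc[:, 0] always yields a Series, even for a one-row frame
        return data.iloc[:, 0]
    if isinstance(data, pd.Series):
        return data
    # Flatten arrays such as shape (n, 1) down to shape (n,)
    return pd.Series(np.asarray(data).ravel())


def first_difference(data):
    """First-order difference as a 1-D NumPy array, whatever the input container."""
    return to_1d_series(data).diff(1).dropna().values


# Illustrative data; the column name is an assumption for this example
ts_df = pd.DataFrame({"sales": [10.0, 12.0, 15.0, 19.0]})
target = "sales"

from_frame = first_difference(ts_df[[target]])          # single-column DataFrame
from_series = first_difference(ts_df[target])           # Series
from_array = first_difference(ts_df[[target]].values)   # array of shape (n, 1)
print(from_frame, from_series, from_array)
```

All three calls return the same differenced array, so the call site in fit() would not need to change if test_stationarity normalised its input this way.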

auto_ts\__init__.py:354, in auto_timeseries.fit(self, traindata, ts_column, target, sep, cv)
    351         print('There is no differencing needed in this datasets for VAR model')
    352 else:
    353     ### If it is not VAR, you need to test only target var for stationarity!
--> 354     diff_limit = test_stationarity(ts_df[[target]], plot=False, verbose=True, var_only=False)
    355     if diff_limit:
    356         print('There is some differencing needed in this datasets for stat models')

auto_ts\utils\eda.py:293, in test_stationarity(time_df, maxlag, regression, autolag, window, plot, verbose, var_only)
    291 if dftest[1] >= alpha:
    292     print(' this series is non-stationary. Trying test again after differencing...')
--> 293     timeseries = pd.Series(timeseries).diff(1).dropna().values
    294     dftest = smt.adfuller(timeseries, maxlag=maxlag, regression=regression, autolag=autolag)
    295     dfoutput = pd.Series(dftest[0:4], index=['Test Statistic',
    296                                              'p-value',
    297                                              '#Lags Used',
    298                                              'Number of Observations Used',
    299                                              ],name='Dickey-Fuller Augmented Test')

pandas\core\series.py:367, in Series.__init__(self, data, index, dtype, name, copy, fastpath)
    363 else:
    365     name = ibase.maybe_extract_name(name, data, type(self))
--> 367     if is_empty_data(data) and dtype is None:
    368         # gh-17261
    369         warnings.warn(
    370             "The default dtype for empty Series will be 'object' instead "
    371             "of 'float64' in a future version. Specify a dtype explicitly "
   (...)
    374             stacklevel=find_stack_level(),
    375         )
    376         # uncomment the line below when removing the FutureWarning
    377         # dtype = np.dtype(object)

pandas\core\construction.py:818, in is_empty_data(data)
    816 is_none = data is None
    817 is_list_like_without_dtype = is_list_like(data) and not hasattr(data, "dtype")
--> 818 is_simple_empty = is_list_like_without_dtype and not data
    819 return is_none or is_simple_empty

pandas\core\generic.py:1527, in NDFrame.__nonzero__(self)
   1525 @final
   1526 def __nonzero__(self):
-> 1527     raise ValueError(
   1528         f"The truth value of a {type(self).__name__} is ambiguous. "
   1529         "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
   1530     )

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
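The last frames of the traceback come down to pandas evaluating a DataFrame in a boolean context ("not data" inside is_empty_data). A small sketch of that behaviour in isolation, using a made-up column name "y": double brackets keep a DataFrame, whose truth value is ambiguous, while single brackets return a Series.

```python
import pandas as pd

df = pd.DataFrame({"y": [1.0, 2.0, 3.0]})

message = None
try:
    # df[["y"]] is still a DataFrame, and a DataFrame has no single truth value
    bool(df[["y"]])
except ValueError as exc:
    message = str(exc)
    print(message)

# Single brackets select the column as a Series instead
frame = df[["y"]]
series = df["y"]
print(type(frame).__name__, type(series).__name__)
```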

How to reproduce: build an ML model on a series that needs at least 1 order of differencing, with

 model_type=['ML']

This is a minor fix, so it is probably not worth a pull request on its own.

OK, agreed. Please see the updated version 0.66, which you can upgrade to via:

pip install git+git://github.com/AutoViML/Auto_TS