Error with sklearn hyperparameter search with MultivariateClassifier
timlac opened this issue
Description
I'm trying to use MultivariateClassifier with TimeSeriesForest inside scikit-learn's RandomizedSearchCV to tune hyperparameters.
However, I get the following error:
ValueError: Invalid parameter n_estimators for estimator MultivariateClassifier(estimator=TimeSeriesForest()). Check the list of available parameters with `estimator.get_params().keys()`.
Steps/Code to Reproduce
import numpy as np
from pyts.classification import TimeSeriesForest
from pyts.multivariate.classification import MultivariateClassifier
from sklearn.model_selection import RandomizedSearchCV

# Placeholder example data: (observations, dimensions, time steps)
X = np.ones((246, 17, 1084))
y = np.ones(246)

n_estimators_values = [int(x) for x in np.linspace(10, 500, num=100)]
parameters = {'n_estimators': n_estimators_values}

rf = MultivariateClassifier(TimeSeriesForest())
clf = RandomizedSearchCV(
    estimator=rf,
    param_distributions=parameters,
    scoring='roc_auc_ovo_weighted',
    verbose=1,
    n_iter=5000,
    random_state=27,
    n_jobs=-1,
)
clf.fit(X, y)
Versions
Python 3.8.13
NumPy 1.22.3
SciPy 1.7.3
Scikit-Learn 1.0.2
Numba 0.56.0
Pyts 0.12.0
Hi,
I'm very sorry for the delayed response (vacations). Here is a Stack Overflow post with a similar setting: a one-vs-rest classifier for which one wants to try out several values for the hyperparameters of the base classifier. If you run rf.get_params(), you get the following dictionary:
>>> rf.get_params()
{'estimator__bootstrap': True,
'estimator__ccp_alpha': 0.0,
'estimator__class_weight': None,
'estimator__criterion': 'entropy',
'estimator__max_depth': None,
'estimator__max_features': 'auto',
'estimator__max_leaf_nodes': None,
'estimator__max_samples': None,
'estimator__min_impurity_decrease': 0.0,
'estimator__min_samples_leaf': 1,
'estimator__min_samples_split': 2,
'estimator__min_weight_fraction_leaf': 0.0,
'estimator__min_window_size': 1,
'estimator__n_estimators': 500,
'estimator__n_jobs': None,
'estimator__n_windows': 1.0,
'estimator__oob_score': False,
'estimator__random_state': None,
'estimator__verbose': 0,
'estimator': TimeSeriesForest(),
'weights': None}
So you can get and set the parameters of the base estimator by using the prefix estimator__. In your case, you have to replace
parameters = {'n_estimators': n_estimators_values}
with
parameters = {'estimator__n_estimators': n_estimators_values}
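The same prefix rule can be checked with scikit-learn alone. The following is only a minimal sketch: it assumes that OneVsRestClassifier wrapping a RandomForestClassifier is a fair stand-in for MultivariateClassifier(TimeSeriesForest()), and the data, the candidate values and the cv setting are made up for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.multiclass import OneVsRestClassifier

# Synthetic placeholder data (assumption): 60 samples, 5 features, 3 classes.
rng = np.random.RandomState(0)
X = rng.normal(size=(60, 5))
y = np.repeat([0, 1, 2], 20)

# A wrapper estimator, standing in for MultivariateClassifier(TimeSeriesForest()).
wrapped = OneVsRestClassifier(RandomForestClassifier(random_state=0))

# Parameters of the wrapped estimator are only exposed with the
# "estimator__" prefix, not under their bare names.
param_names = wrapped.get_params().keys()
has_prefixed = "estimator__n_estimators" in param_names
has_bare = "n_estimators" in param_names

# Search over the prefixed name; the candidate values are illustrative.
search = RandomizedSearchCV(
    estimator=wrapped,
    param_distributions={"estimator__n_estimators": [5, 10, 20]},
    n_iter=3,
    cv=3,
    random_state=27,
)
search.fit(X, y)
best_n = search.best_params_["estimator__n_estimators"]
print(has_prefixed, has_bare, best_n)
```

The key under best_params_ keeps the estimator__ prefix, so results are read back under the same name that was passed in param_distributions.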
Hope that this answers your issue.
Best,
Johann