Muhammad4hmed / GML

Auto Data Science - Python Library.

How to pass data into AutoML

prashanthGit945 opened this issue

from GML import AutoML
from sklearn.metrics import mean_squared_error

# X (features) and y (target) are loaded earlier
gml_ml = AutoML()
gml_ml.GMLRegressor(X, y, metric=mean_squared_error, folds=10)


KeyError Traceback (most recent call last)
in
2
3 gml_ml = AutoML()
----> 4 gml_ml.GMLRegressor(X, y, metric =mean_squared_error, folds =5)

~\Anaconda3\lib\site-packages\GML\ML.py in GMLRegressor(self, X, y, metric, folds)
199 for i,model in enumerate(self.reg_models):
200 name = str(model.__class__.__name__)
--> 201 scores = self.cross_val(X, y, model, metric, folds)
202
203 print('{} got score of {} in {} folds'.format(model.__class__.__name__,scores,folds))

~\Anaconda3\lib\site-packages\GML\ML.py in cross_val(self, X, y, model, metric, folds)
165 for tr_in, val_in in KFold(n_splits = folds).split(X, y):
166 model_fold = model
--> 167 X_train, y_train, X_val, y_val = X.iloc[tr_in,:], y[tr_in], X.iloc[val_in,:], y[val_in]
168 model_fold.fit(X_train, y_train)
169 y_hat = model.predict(X_val)

~\Anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
904 return self._get_values(key)
905
--> 906 return self._get_with(key)
907
908 def _get_with(self, key):

~\Anaconda3\lib\site-packages\pandas\core\series.py in _get_with(self, key)
939 # (i.e. self.iloc) or label-based (i.e. self.loc)
940 if not self.index._should_fallback_to_positional():
--> 941 return self.loc[key]
942 else:
943 return self.iloc[key]

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
877
878 maybe_callable = com.apply_if_callable(key, self.obj)
--> 879 return self._getitem_axis(maybe_callable, axis=axis)
880
881 def _is_scalar_access(self, key: Tuple):

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_axis(self, key, axis)
1097 raise ValueError("Cannot index with multidimensional key")
1098
-> 1099 return self._getitem_iterable(key, axis=axis)
1100
1101 # nested tuple slicing

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_iterable(self, key, axis)
1035
1036 # A collection of keys
-> 1037 keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
1038 return self.obj._reindex_with_indexers(
1039 {axis: [keyarr, indexer]}, copy=True, allow_dups=True

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
1252 keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
1253
-> 1254 self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
1255 return keyarr, indexer
1256

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
1313
1314 with option_context("display.max_seq_items", 10, "display.width", 80):
-> 1315 raise KeyError(
1316 "Passing list-likes to .loc or [] with any missing labels "
1317 "is no longer supported. "

KeyError: "Passing list-likes to .loc or [] with any missing labels is no longer supported. The following labels were missing: Int64Index([1204, 1211, 1219, 1239, 1243,\n ...\n 5861, 5868, 5881, 5904, 5935],\n dtype='int64', length=312). See https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike"

Please make sure that X and y are both DataFrames. If they are not, you can convert them:

import pandas as pd

X = pd.DataFrame(X)
y = pd.DataFrame(y)

After feature engineering I collected the data with X_new, y, test = fe.get_new_data()
and passed X=X_new and y=y, and then I got this error. I checked that X_new and y are DataFrames, and I also tried reset_index.

If passing y as a DataFrame does not work, try converting it to a NumPy array:

import numpy as np

y = np.array(y)
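
A minimal sketch of why the array conversion helps, assuming y is a pandas Series whose index no longer runs from 0 to n-1 (for example, because rows were dropped during feature engineering). The values and index below are made up for illustration. The cross-validation split yields row positions, but y[positions] on a Series with an integer index is a label lookup, so any position that is not also a label raises the KeyError seen in the traceback. A NumPy array has no labels, so the same positions work.

```python
import numpy as np
import pandas as pd

# Hypothetical target: four rows left after filtering, original labels kept.
y = pd.Series([10.0, 20.0, 30.0, 40.0], index=[0, 2, 5, 7])

# Row positions of the kind KFold(...).split() yields for a training fold.
tr_in = np.array([2, 3])

try:
    y[tr_in]  # label-based lookup: label 3 does not exist in the index
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Converting to a NumPy array makes the lookup purely positional.
y_arr = np.array(y)
y_train = y_arr[tr_in]

# Alternative (assumption: the result is assigned back): rebuild a 0..n-1 index.
y_reset = y.reset_index(drop=True)
y_train_reset = y_reset[tr_in]
```

Note that reset_index returns a new object by default, so calling it without assigning the result (or without drop=True) leaves the original labels in place.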

OK, I got it. How do I predict on test data (unseen data)?

GML tells you which model performs well. After testing all the models in GML, it expects you to import that model from sklearn and train it yourself.
I will add this functionality in the next update. Sorry for the inconvenience.
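
A rough sketch of that manual step, under stated assumptions: RandomForestRegressor stands in for whichever model GML reported as best (the thread does not say which models GML compares), and the random data below is a hypothetical substitute for the X_new, y and test returned by fe.get_new_data().

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical stand-ins for X_new, y, test from fe.get_new_data().
rng = np.random.default_rng(0)
cols = ["a", "b", "c"]
X_new = pd.DataFrame(rng.normal(size=(50, 3)), columns=cols)
y = X_new["a"] * 2.0 + rng.normal(scale=0.1, size=50)
test = pd.DataFrame(rng.normal(size=(10, 3)), columns=cols)

# Assumed choice: replace with the model GML scored best.
model = RandomForestRegressor(n_estimators=50, random_state=0)

# Fit on all of the training data, then predict on the unseen rows.
model.fit(X_new, np.array(y))
predictions = model.predict(test)
```

The test frame needs the same columns, in the same order, as the frame used for fitting.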