ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.
NiDHanWang opened this issue · comments
Hi there! Thanks so much for such a good piece of work, it really helps!
But recently an error was raised when I used identify_zero_importance. It works well when I turn off early_stopping, and the error is raised when I turn it on.
Here's my code:
from feature_selector import FeatureSelector
select_label=train_fill['SalePrice']
select_featrue=train_fill.drop(columns=['SalePrice','Id'])
fs=FeatureSelector(data=select_featrue,labels=select_label)
fs.identify_zero_importance(task='regression',eval_metric='L2',n_iterations=10,early_stopping=True)
Here's the error:
ValueError Traceback (most recent call last)
in
----> 1 fs.identify_zero_importance(task='regression',eval_metric='L2',n_iterations=10,early_stopping=True)
D:\anaconda\lib\site-packages\feature_selector.py in identify_zero_importance(self, task, eval_metric, n_iterations, early_stopping)
304 if early_stopping:
305
--> 306 train_features, valid_features, train_labels, valid_labels = train_test_split(features, labels, test_size = 0.15, stratify=labels)
307
308 # Train the model with early stopping
D:\anaconda\lib\site-packages\sklearn\model_selection\_split.py in train_test_split(*arrays, **options)
2119 random_state=random_state)
2120
-> 2121 train, test = next(cv.split(X=arrays[0], y=stratify))
2122
2123 return list(chain.from_iterable((safe_indexing(a, train),
D:\anaconda\lib\site-packages\sklearn\model_selection\_split.py in split(self, X, y, groups)
1321 """
1322 X, y, groups = indexable(X, y, groups)
-> 1323 for train, test in self._iter_indices(X, y, groups):
1324 yield train, test
1325
D:\anaconda\lib\site-packages\sklearn\model_selection\_split.py in _iter_indices(self, X, y, groups)
1634 class_counts = np.bincount(y_indices)
1635 if np.min(class_counts) < 2:
-> 1636 raise ValueError("The least populated class in y has only 1"
1637 " member, which is too few. The minimum"
1638 " number of groups for any class cannot"
ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2
It seems that something goes wrong when it tries to split the data into train and test sets at line 306. How can I fix it?
I solved this problem by removing the stratify argument from the train_test_split call at line 306.
Same issue here. I tried all the alternatives for task= and eval_metric=, and always get the same error when early_stopping is set to True.
I also tried providing Y in different formats (array, pandas DataFrame, pandas Series), with the same error.
@DeckerDai: removing the stratify argument did not solve it for me.
No error when early_stopping=False.
I'm also not sure why the error even comes up, given that I'm doing a regression problem (task='regression', eval_metric='l2').
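A likely explanation, going by the traceback: line 306 passes stratify=labels no matter which task is selected, and the stratified splitter in scikit-learn treats every distinct value in y as its own class. With a continuous regression target almost every value is unique, so each "class" has a single member and the check shown in the traceback fails. Here is a minimal sketch that reproduces this outside of feature_selector. The random data and the helper name try_split are made up for illustration and are not part of the library:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a regression dataset (assumption: not the real data).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = rng.uniform(100_000, 500_000, size=100)  # continuous target, values effectively all unique


def try_split(stratify):
    """Hypothetical helper: report whether the split succeeds or raises ValueError."""
    try:
        train_test_split(X, y, test_size=0.15, stratify=stratify, random_state=0)
        return "ok"
    except ValueError:
        return "ValueError"


# Stratifying on a continuous target: every value is a one-member "class".
stratified_result = try_split(y)
# The same split without stratification.
plain_result = try_split(None)

print(stratified_result, plain_result)
```

If this is indeed the cause, it would also explain why changing eval_metric or the format of the labels makes no difference: the split fails before any model is trained.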
@DeckerDai wrote: "I solve this problem by removing the argument stratify in function train_test_split at the line 306."
I explored that line and came up with this solution, which keeps the stratify argument for 'classification' but not for 'regression':
if early_stopping:
    if task == 'classification':
        train_features, valid_features, train_labels, valid_labels = train_test_split(features, labels, test_size = 0.15, stratify=labels)
    else:
        train_features, valid_features, train_labels, valid_labels = train_test_split(features, labels, test_size = 0.15)
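For anyone who wants to try the idea without editing the installed package first, the same logic can be written as a small standalone function. This is only a sketch: split_for_early_stopping is a hypothetical name, not something feature_selector provides, and the random_state parameter and the synthetic data are added here just to make the example repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def split_for_early_stopping(features, labels, task, test_size=0.15, random_state=None):
    """Hypothetical helper: stratify the validation split only for classification."""
    # Stratifying needs at least 2 samples per distinct label value,
    # which a continuous regression target will not have.
    stratify = labels if task == "classification" else None
    return train_test_split(
        features, labels,
        test_size=test_size,
        stratify=stratify,
        random_state=random_state,
    )


# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y_reg = rng.uniform(100_000, 500_000, size=100)  # continuous target
y_clf = np.array([0] * 80 + [1] * 20)            # imbalanced binary target

# Regression: plain random split, no error.
_, X_valid_reg, _, y_valid_reg = split_for_early_stopping(
    X, y_reg, task="regression", random_state=0)

# Classification: stratified split keeps the 80/20 class ratio in the validation set.
_, X_valid_clf, _, y_valid_clf = split_for_early_stopping(
    X, y_clf, task="classification", random_state=0)

print(len(y_valid_reg), len(y_valid_clf), int(y_valid_clf.sum()))
```

Keeping stratification for classification still seems worthwhile, since it preserves the class balance in the validation set used for early stopping.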