AxeldeRomblay / MLBox

MLBox is a powerful Automated Machine Learning python library.

Home Page: https://mlbox.readthedocs.io/en/latest/

Number of test samples : 1

alexnix opened this issue

I use a train (12k rows) and a test (4k rows) file.
I read them like this:

from mlbox.preprocessing import Reader

paths = ["Fields_train.csv", "Fields_test.csv"]  # train file first, then test file
target_name = "price"  # column to predict
rd = Reader(sep=",")
df = rd.train_test_split(paths, target_name)

But the output says there is only one test sample.
This causes the error "Only one class present in y_true. ROC AUC score is not defined in that case." when running:

from mlbox.preprocessing import Drift_thresholder

dft = Drift_thresholder()
df = dft.fit_transform(df)

PS: I am experimenting with a regression task...

commented

Hello @alexnix,
The test set is detected as the part of the data where the target is missing (either the column is absent or its values are missing). If you still want to predict on this specific set, you have to remove the "price" column from it. Hope it helps!
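To make the rule above concrete, here is a minimal sketch of splitting rows by whether the target is present. It is an illustration written with the standard csv module, not MLBox's actual implementation; the function name split_by_missing_target and the sample data are made up for this example.

```python
import csv
import io


def split_by_missing_target(csv_text, target_name):
    """Illustrative only (not MLBox code): rows with a target value go to
    train, rows whose target is empty or absent go to test."""
    train_rows, test_rows = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = row.get(target_name)  # None when the column does not exist
        if value is None or value.strip() == "":
            test_rows.append(row)
        else:
            train_rows.append(row)
    return train_rows, test_rows


# Hypothetical data: every row has a price except one.
sample = "area,price\n10,100\n20,\n30,300\n"
train_rows, test_rows = split_by_missing_target(sample, "price")
print(len(train_rows), len(test_rows))
```

Under this rule, a test file that still carries a filled-in "price" column contributes almost nothing to the test set, which would be consistent with the "Number of test samples : 1" message reported above.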

Indeed, this was the issue: it worked when I removed the price column from the test data set. Thanks!
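For anyone hitting the same problem, one way to remove the column is to write a copy of the test file without it before passing the paths to Reader. This is only a sketch using the standard csv module; the helper drop_column and the temporary file names are hypothetical and not part of MLBox.

```python
import csv
import os
import tempfile


def drop_column(src_path, dst_path, column):
    """Copy a CSV file to dst_path, leaving out one column (hypothetical helper)."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        kept = [name for name in (reader.fieldnames or []) if name != column]
        # extrasaction="ignore" silently skips the dropped column in each row
        writer = csv.DictWriter(dst, fieldnames=kept, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            writer.writerow(row)


# Small demo on a temporary file with made-up data.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "test_with_price.csv")
    dst = os.path.join(tmp, "test_no_price.csv")
    with open(src, "w", newline="") as f:
        f.write("area,price\n10,100\n20,200\n")
    drop_column(src, dst, "price")
    with open(dst, newline="") as f:
        result = f.read()
print(result)
```

In the setup from this issue, the same helper could be applied to "Fields_test.csv", with the resulting file listed in paths in place of the original test file.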