Number of test samples : 1

Question

alexnix opened this issue 4 years ago

I have a training file (12k rows) and a test file (4k rows).
I read them like this:

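from mlbox.preprocessing import Reader, Drift_thresholder

paths = ["Fields_train.csv", "Fields_test.csv"]
target_name = "price"
rd = Reader(sep=",")
# Reads both files and splits the rows into a train set and a test set
df = rd.train_test_split(paths, target_name)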

But the output reports only one test sample.
This later raises "Only one class present in y_true. ROC AUC score is not defined in that case." when running:

dft = Drift_thresholder()
df = dft.fit_transform(df)

PS: I am experimenting with a regression task.

Axel · Answer 1 · Mon May 04 2020 16:21:09 GMT+0800 (China Standard Time)

Hello @alexnix,
The test set is detected as the rows where the target is missing (either the column is absent or its values are missing). If you still want to predict on this specific set, you have to remove the "price" column from it. Hope it helps!
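
One way to remove the column is sketched below, using only Python's standard csv module. The helper name drop_column and the output file name Fields_test_no_target.csv are hypothetical and not part of MLBox; the sketch assumes the test file has a header row containing "price".

```python
import csv


def drop_column(src_path, dst_path, column):
    """Copy a CSV file to dst_path, leaving out one column (hypothetical helper)."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        header = reader.fieldnames or []
        if column not in header:
            raise KeyError(f"{column!r} not found in {src_path}")
        # Keep every column except the one being dropped, in the original order
        kept = [name for name in header if name != column]
        # extrasaction="ignore" makes the writer skip the dropped column's values
        writer = csv.DictWriter(dst, fieldnames=kept, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            writer.writerow(row)
```

Calling drop_column("Fields_test.csv", "Fields_test_no_target.csv", "price") and then passing ["Fields_train.csv", "Fields_test_no_target.csv"] as paths should leave the test rows without a target, so they can be recognised as the test set.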

Alexandru Niculae · Answer 2 · Sun May 10 2020 18:28:21 GMT+0800 (China Standard Time)

Indeed, this was the issue: it worked once I removed the price column from the test dataset. Thanks!