DeepFM - singleton array issue
duncanmcelfresh opened this issue · comments
occurs with dataset: openml__sulfur__360966
traceback:
Traceback (most recent call last):
  File "/home/shared/tabzilla/TabSurvey/tabzilla_experiment.py", line 137, in __call__
    result = cross_validation(model, self.dataset, self.time_limit)
  File "/home/shared/tabzilla/TabSurvey/tabzilla_utils.py", line 236, in cross_validation
    loss_history, val_loss_history = curr_model.fit(
  File "/home/shared/tabzilla/TabSurvey/models/deepfm.py", line 89, in fit
    loss_history, val_loss_history = self.model.fit(
  File "/home/shared/tabzilla/TabSurvey/models/deepfm_lib/models/basemodel.py", line 252, in fit
    train_result[name].append(metric_fun(
  File "/opt/conda/envs/torch/lib/python3.10/site-packages/sklearn/metrics/_regression.py", line 442, in mean_squared_error
    y_type, y_true, y_pred, multioutput = _check_reg_targets(
  File "/opt/conda/envs/torch/lib/python3.10/site-packages/sklearn/metrics/_regression.py", line 100, in _check_reg_targets
    check_consistent_length(y_true, y_pred)
  File "/opt/conda/envs/torch/lib/python3.10/site-packages/sklearn/utils/validation.py", line 384, in check_consistent_length
    lengths = [_num_samples(X) for X in arrays if X is not None]
  File "/opt/conda/envs/torch/lib/python3.10/site-packages/sklearn/utils/validation.py", line 384, in <listcomp>
    lengths = [_num_samples(X) for X in arrays if X is not None]
  File "/opt/conda/envs/torch/lib/python3.10/site-packages/sklearn/utils/validation.py", line 325, in _num_samples
    raise TypeError(
TypeError: Singleton array array(0.0816204) cannot be considered a valid collection.
fyi - it looks like most of the DeepFM errors are from this bug, so fixing this should get us a lot more results for DeepFM
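For anyone trying to reproduce this outside the experiment runner, here is a minimal sketch of what a "singleton array" is, using NumPy only. The value is copied from the traceback; the `has_length` helper is hypothetical and only stands in for the kind of length check that fails in the traceback, it is not sklearn's code.

```python
import numpy as np


def has_length(a):
    """Return True if len() works on `a`, i.e. it has at least one axis."""
    try:
        len(a)
        return True
    except TypeError:
        return False


# One sample, one output column, as a (1, 1) array.
single = np.array([[0.0816204]])
# Two samples for comparison, as a (2, 1) array.
pair = np.array([[0.1], [0.2]])

# squeeze() with no arguments drops every size-1 axis.
single_squeezed = single.squeeze()  # shape (): the sample axis is gone too
pair_squeezed = pair.squeeze()      # shape (2,): only the output axis is gone

print(single_squeezed.shape, has_length(single_squeezed))
print(pair_squeezed.shape, has_length(pair_squeezed))
```

The 0-dimensional result has no length at all, which matches the `array(0.0816204)` shown in the error message.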
Looks like this line was causing an issue (in basemodel.py for DeepFM)
y_pred = model(x).squeeze()
If y_pred contains only one sample, squeeze() returns a 0-dimensional (scalar) tensor rather than a length-1 tensor, which causes the mismatch in the loss and metric calculations. I think the following should fix it:
if len(y_pred) != 1:
    y_pred = y_pred.squeeze()
    y = y.squeeze()
This raises the following warning when a batch contains a single sample, but I think it is okay: broadcasting a size-[1] target against a size-[1, 1] prediction still compares the same single pair of values.
UserWarning: Using a target size (torch.Size([1])) that is different to the input size (torch.Size([1, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
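A possible alternative that might avoid the warning altogether is to flatten both the prediction and the target to shape (batch,) instead of squeezing conditionally. This is only a sketch and has not been tried against basemodel.py. It assumes the model emits one output per sample, so y_pred has shape (batch, 1). The `flatten_pair` helper is a hypothetical name. NumPy arrays are used here to keep the example self-contained; `torch.Tensor.reshape(-1)` follows the same shape rules.

```python
import numpy as np


def flatten_pair(y_pred, y):
    """Flatten predictions and targets to 1-D, keeping the batch axis.

    Assumes a single output per sample, so the flattened length
    equals the batch size.
    """
    return y_pred.reshape(-1), y.reshape(-1)


shapes = []
for batch in (1, 4):
    y_pred = np.zeros((batch, 1))  # stand-in for model(x)
    y = np.ones(batch)             # stand-in for the targets
    p, t = flatten_pair(y_pred, y)
    shapes.append((p.shape, t.shape))

print(shapes)
```

With this approach a single-sample batch ends up as two length-1 arrays, so the shapes match and no broadcasting is involved. It would not be appropriate for multi-output models, where flattening would mix the output columns into the batch axis.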