AutoML: CI trips with `ValueError: Input contains NaN.`

Question

amotl opened this issue 5 months ago · comments

This was originally reported in GH-170, an issue that mixed several things up, so let's get things straight here.

Problem

CI on the AutoML job occasionally trips like this, failing the CI run.

FAILED test.py::test_file[automl_timeseries_forecasting_with_pycaret.py] - ValueError: Input contains NaN.

self = <joblib.parallel.BatchCompletionCallBack object at 0x7f4f737cb910>

    def _return_or_raise(self):
        try:
            if self.status == TASK_ERROR:
>               raise self._result
E               ValueError: Input contains NaN.

-- https://github.com/crate/cratedb-examples/actions/runs/7884792002/job/21514554253#step:6:1146

Outlook

@andnig shared his suggestions at #170 (comment) already. Maybe you can add them here instead?

Andreas Motl · Answer 1 · Wed Feb 14 2024 06:33:04 GMT+0800 (China Standard Time)

Recommendation

@andnig suggested:

To go forward, you could use a different model for the test run, one with a lower MASE.

#170 (comment)

Thanks!

Rationale

If I look at the failed run, I see that the esm model has an incredibly high MASE and RMSSE. This mostly indicates that the model is not very well suited for the data. I suggested it because it is very lightweight, but well, too lightweight as it seems 😓
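
For readers less familiar with the two metrics, here is a minimal sketch of how MASE and RMSSE are commonly defined: the forecast error is scaled by the in-sample error of a naive forecast, so values far above 1 mean the model does much worse than simply repeating the last observation. The function names and all numbers are made up for illustration; this is not the code PyCaret runs.

```python
import math


def naive_errors(y_train, season=1):
    """Errors of the in-sample (seasonal) naive forecast y[t] = y[t - season]."""
    return [y_train[i] - y_train[i - season] for i in range(season, len(y_train))]


def mase(y_train, y_true, y_pred, season=1):
    """Mean absolute scaled error: forecast MAE divided by naive in-sample MAE."""
    errors = naive_errors(y_train, season)
    scale = sum(abs(e) for e in errors) / len(errors)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    return mae / scale


def rmsse(y_train, y_true, y_pred, season=1):
    """Root mean squared scaled error: sqrt(forecast MSE / naive in-sample MSE)."""
    errors = naive_errors(y_train, season)
    scale = sum(e * e for e in errors) / len(errors)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse / scale)


# Made-up numbers, purely for illustration.
history = [10, 12, 11, 13, 12, 14]
actual = [13, 15]
well_suited = [13.5, 14.5]   # hypothetical forecast close to the actual values
poorly_suited = [40, -20]    # hypothetical forecast far off the actual values

print("MASE  well suited:  ", mase(history, actual, well_suited))
print("MASE  poorly suited:", mase(history, actual, poorly_suited))
print("RMSSE well suited:  ", rmsse(history, actual, well_suited))
print("RMSSE poorly suited:", rmsse(history, actual, poorly_suited))
```

With these numbers the well-suited forecast scores below 1 on both metrics, and the poorly suited one scores far above 1.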

Andreas Motl · Answer 2 · Wed Feb 14 2024 16:58:11 GMT+0800 (China Standard Time)

Hi again. GH-300 changes the script to use a single model exclusively, "ets_cds_dt". Unfortunately, it still trips on CI.

Andreas Nigg · Answer 3 · Wed Feb 14 2024 22:20:40 GMT+0800 (China Standard Time)

Wasn't the script about using 3 models? I think the later benchmarking operations need at least 3 models, don't they?
Using 1 model without adjusting the later call will probably cause the trainers to fail.
But you'd also see this locally, not only on CI.
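
One way to make this kind of mismatch visible early is a small guard that fails fast when fewer candidate models are configured than the later steps expect. This is only a sketch: `REQUIRED_MODELS`, `check_candidates`, and the placeholder IDs `model_b` and `model_c` are hypothetical and not taken from the example script, and the count of 3 is the assumption discussed above.

```python
# Hypothetical guard; names and the required count are assumptions.
REQUIRED_MODELS = 3  # assumed number of models the later benchmarking steps expect


def check_candidates(candidates, required=REQUIRED_MODELS):
    """Return the distinct candidate model IDs, or raise if there are too few."""
    unique = list(dict.fromkeys(candidates))  # de-duplicate while keeping order
    if len(unique) < required:
        raise ValueError(
            f"need at least {required} candidate models, got {len(unique)}: {unique}"
        )
    return unique


# Three candidates pass the check ("model_b" and "model_c" are placeholders).
print(check_candidates(["ets_cds_dt", "model_b", "model_c"]))

# A single candidate is rejected with a clear message, locally as well as on CI.
try:
    check_candidates(["ets_cds_dt"])
except ValueError as exc:
    print("rejected:", exc)
```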

Andreas Motl · Answer 4 · Thu Feb 15 2024 05:34:37 GMT+0800 (China Standard Time)

Ah all right. That looks like I didn't know what I was doing at all. Thanks!

Andreas Motl · Answer 5 · Tue Feb 27 2024 20:14:40 GMT+0800 (China Standard Time)

Currently, we see no problems on CI in this regard. Therefore, I am closing the issue. Thanks for your support, @andnig!

Andreas Motl · Answer 6 · Fri Apr 12 2024 05:38:25 GMT+0800 (China Standard Time)

The problem still happens occasionally, so re-opening.

-- https://github.com/crate/cratedb-examples/actions/runs/8644976949/job/23701224841#step:6:1137

Andreas Motl · Answer 7 · Mon Apr 15 2024 01:27:51 GMT+0800 (China Standard Time)

Happened again on the nightly job run.
-- https://github.com/crate/cratedb-examples/actions/runs/8679470840/job/23798230063#step:6:1176

And again on a PR.
-- https://github.com/crate/cratedb-examples/actions/runs/8744668161/job/23998070626?pr=425#step:6:1203