sktime / sktime

A unified framework for machine learning with time series

Home Page: https://www.sktime.net

[MNT] lack of test coverage of `pandas 2.2.X` *and* deep learning backends

fkiraly opened this issue · comments

As remarked by @yarnabrina on Discord, the current tests lack coverage of estimators that require deep learning backends on pandas 2.2.X.

This is due to neural network backends only being present in test-full, which contains packages that imply pandas < 2.2.X.

Conversely, on the short module tests where pandas is not restricted by other dependencies, deep learning backends do not get installed.

How best to solve this, @sktime/core-developers?

Possible solutions:

  • add the dl dep set in the modules job - this would only work if the dl dep set does not imply pandas < 2.2.X
  • add another batch of tests with the cross-condition
  • estimator-specific environments...

This is due to neural network backends only being present in test-full, which contains packages that imply pandas < 2.2.X.

I may not have explained correctly, but this is not what I meant. What I meant is that currently the dl extra is only tested in the test-full job of the old CI, where all_extras_pandas2 and dev are also installed. As a result of this massive set of packages (and their dependencies), it now resolves to pandas 2.1.4. It may be caused by the dl packages alone, but I have not checked that myself.

With more packages to be added to dl soon (transformers from the Hugging Face PR), this is only going to add more restrictions. I think the only way forward to ensure full coverage will be the per-estimator test idea that @fkiraly proposed earlier (can't find the issue). However, that seems really difficult, as GitHub restricts the number of jobs that can be created from a matrix (256, I think), and we will surely hit that limit (definitely when considering OS and Python versions, but probably on the number of estimators alone).

Another possibility, though I have not thought out the pros/cons/feasibility yet:

  1. run a script to identify all modified estimators (the current subset logic used to decide when to run a test)
  2. identify the dependencies of these estimators
  3. create a job matrix of unique dependencies (e.g. statsmodels, neuralforecast, huggingface+chronos, etc.)
  4. in each job, run all the tests on the modified estimators across several OS/Python versions
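
To make the grouping in steps 2 and 3 concrete, here is a rough sketch in Python - the dependency data is made up for illustration, the real script would read it from the estimators themselves:

```python
from collections import defaultdict

# hypothetical output of step 2: modified estimator -> its soft dependencies
estimator_deps = {
    "NeuralForecastRNN": ["neuralforecast"],
    "NeuralForecastLSTM": ["neuralforecast"],
    "SimpleRNNRegressor": ["tensorflow"],
}

# step 3: group estimators by their unique dependency set,
# so that each group becomes one entry of the CI job matrix
groups = defaultdict(list)
for estimator, deps in estimator_deps.items():
    groups[frozenset(deps)].append(estimator)

matrix_include = [
    {"estimators": sorted(ests), "dependencies": sorted(deps)}
    for deps, ests in groups.items()
]
print(matrix_include)
```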

I may not have explained correctly, but this is not what I meant

I think we do mean the same thing; my formulation was unclear. When I said

"This is due to neural network backends only being present in test-full, which contains packages that imply pandas < 2.2.X."

the "which" was referencing the test-full environment rather than the neural network backends (about which I do not know whether they imply bounds on pandas.

Can you confirm whether you now think that we mean the same, or not?

can't find the issue

I had the same problem and was expecting that you would link it 😁

Now I looked again, and this time I used the tag "testing". Good that I keep tagging issues - it was the third one from the top:
#5719

Another possibility, though I have not thought out the pros/cons/feasibility yet:

How would that work, mechanically?
Do we need a CLI for test collection?

Either way, with the current logic, we need to do env setup, then run python code, then set up an env depending on the output of that, then run tests in that second env.

I would not know how to do this, even though I probably could find out after a few days of research.

Can you confirm whether you now think that we mean the same, or not?

Yes, I meant the same as you.

How would that work, mechanically?

I am not fully clear either, but the idea depends on the assumption that we can create dynamic jobs based on newly generated JSON (or similar) files. In GitLab, it's possible to create new child pipelines with custom specifications and configurations based on previous steps and their artefacts, so I am hoping GitHub Actions has that too.

If this is indeed possible, the rest is pretty straightforward, I think. The remaining steps will be configured by OS name, Python version, and the soft dependency (possibly more than one) to install, and will be very similar to the current flow.
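
For what it's worth, GitHub Actions does support this pattern: one job can write a JSON string to the `$GITHUB_OUTPUT` file as a step output, and a downstream job can consume it via `fromJSON(...)` in its `strategy.matrix`. A minimal sketch of the Python side only - the file name and job wiring here are assumptions, not a worked-out design:

```python
import json
import os

# "test_matrix.json" is assumed to be produced by the step that detects
# affected estimators and their dependencies
with open("test_matrix.json") as f:
    matrix = json.load(f)

# expose the matrix as a step output; a downstream job would then declare
#   strategy:
#     matrix: ${{ fromJSON(needs.<generate-job>.outputs.matrix) }}
with open(os.environ["GITHUB_OUTPUT"], "a") as f:
    f.write(f"matrix={json.dumps(matrix)}\n")
```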

@MEMEO-PRO @sammychinedu2ky @Xinyu-Wu-0000 @duydl since all of you helped in CI discussions before, any suggestions on whether this is feasible or not?

Sorry, I was not active the last few days. @fkiraly, then I think we can try this as per @Xinyu-Wu-0000's suggestion? This is a high-level implementation idea:

  • run a script to identify all modified estimators (the current subset logic used to decide when to run a test)
  • identify the dependencies of these estimators
  • create a job matrix of unique dependencies (e.g. statsmodels, neuralforecast, huggingface+chronos, etc.)
  • in each job, run all the tests on the modified estimators across several OS/Python versions

The tricky part is step 3, which is where we will try the above suggestion. We can create a JSON file of the different estimators and their dependencies, and then that JSON will define the matrix.

Hm, so that would require:

  • specifying the format for the JSON
  • writing Python code that creates the JSON? (that's the only way I can see that does not introduce a substantial manual maintenance burden)

Yes. Hopefully @Xinyu-Wu-0000 can help with the JSON format, and then it should be easy to create it at the end of the Python script that detects "affected" estimators and their dependencies.

Maybe this will work:

{
    "include": [
        {
            "test alias": "foo",
            "estimators": [
                "NeuralForecastRNN"
            ],
            "dependencies": [
                "neuralforecast==1.7.0",
                "statsmodels==0.14.1"
            ]
        },
        {
            "test alias": "bar",
            "estimators": [
                "SimpleRNNRegressor"
            ],
            "dependencies": [
                "tensorflow"
            ]
        }
    ]
}

Dependencies should be easy to get for an estimator if we can use Python - I wrote the `deps` utility in the registry with precisely this in mind.
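
For reference, a minimal sketch of how that lookup could work via the registry and the `python_dependencies` tag (the dedicated utility mentioned above should make this even shorter; the estimator names are taken from the example JSON above):

```python
from sktime.registry import all_estimators

# estimator names detected as "affected" by the change detection step
affected = {"NeuralForecastRNN", "SimpleRNNRegressor"}

estimator_deps = {}
for name, cls in all_estimators(return_names=True):
    if name not in affected:
        continue
    # the python_dependencies tag lists soft dependencies, as str or list of str
    deps = cls.get_class_tag("python_dependencies", None) or []
    if isinstance(deps, str):
        deps = [deps]
    estimator_deps[name] = sorted(deps)

print(estimator_deps)  # e.g. {"NeuralForecastRNN": ["neuralforecast"], ...}
```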