sktime / skbase

Base classes for creating scikit-learn-like parametric objects, and tools for working with them.

[ENH] in `all_objects` changing `filter_tags` by `str` and "list of str" to mean `{val: True}` rather than `val in keys()`

fkiraly opened this issue

Currently, passing a `str` or list of `str` to the `filter_tags` argument of `all_objects` subsets the estimators by whether they have the tag at all, regardless of the tag's value.

This is counterintuitive: semantically, users will expect subsetting by whether the tag is set to `True`. For example, if they search for `capability:missing_values`, they want estimators that can handle missing values.

Instead, the estimators returned are all classes of types for which "missing value handling" is a valid tag, not only those that can handle missing values, because the query also retrieves every case where the tag is set to `False`.
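To make the difference concrete, here is a minimal, self-contained sketch. It does not call `skbase`; the `REGISTRY` dictionary, the class names in it, and both filter functions are hypothetical stand-ins for the lookup logic, written only to contrast "tag is present" with "tag is `True`".

```python
# Toy registry: class name -> tag dictionary.
# Hypothetical data for illustration, not skbase internals.
REGISTRY = {
    "ImputingForecaster": {"capability:missing_values": True},
    "NaiveForecaster": {"capability:missing_values": False},
    "PlainTransformer": {"fit_is_empty": True},
}


def _as_list(filter_tags):
    """Coerce a single str or an iterable of str to a list of str."""
    return [filter_tags] if isinstance(filter_tags, str) else list(filter_tags)


def filter_by_presence(registry, filter_tags):
    """Behaviour described in this issue: keep objects that have the tag at all."""
    tags = _as_list(filter_tags)
    return [name for name, obj_tags in registry.items()
            if all(tag in obj_tags for tag in tags)]


def filter_by_true(registry, filter_tags):
    """Proposed behaviour: str / list of str is shorthand for {tag: True}."""
    tags = _as_list(filter_tags)
    return [name for name, obj_tags in registry.items()
            if all(obj_tags.get(tag) is True for tag in tags)]


present = filter_by_presence(REGISTRY, "capability:missing_values")
truthy = filter_by_true(REGISTRY, "capability:missing_values")
print(present)  # ['ImputingForecaster', 'NaiveForecaster']
print(truthy)   # ['ImputingForecaster']
```

The presence-based filter returns `NaiveForecaster` even though its tag is `False`, which is the surprising result described above.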

I think this is such a severe violation of user expectation that we should change it.

Given that it is in a central utility, there should be a deprecation cycle accompanying it.
That is, for one release cycle, users passing a `str` or list of `str` will be warned that the behaviour will change, with a message on how to keep the current behaviour. We should probably also add callables `x -> bool` as options for fields or similar.
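A rough sketch of how such a deprecation cycle could look, under stated assumptions: the helper names `normalize_filter_tags`, `tag_matches`, and `matches` are hypothetical, the warning text is only a placeholder, and the use of an always-true callable as the way to keep the current behaviour is one possible design, not something decided in this issue.

```python
import warnings

_MISSING = object()  # sentinel for "tag not defined on this object"


def tag_matches(tag_value, condition):
    """A condition is either a plain value (equality) or a callable x -> bool."""
    if callable(condition):
        return bool(condition(tag_value))
    return tag_value == condition


def normalize_filter_tags(filter_tags):
    """Turn str / list of str into a dict, warning about the planned change."""
    if isinstance(filter_tags, dict):
        return filter_tags
    tags = [filter_tags] if isinstance(filter_tags, str) else list(filter_tags)
    warnings.warn(
        "Passing str or list of str as filter_tags currently selects objects "
        "that have the tag; in a future release it will mean {tag: True}. "
        "To keep the current behaviour, pass a dict with an always-true "
        "callable, e.g. {tag: lambda x: True}.",
        FutureWarning,
        stacklevel=2,
    )
    # During the deprecation period, keep the current "tag is present" meaning.
    return {tag: (lambda value: True) for tag in tags}


def matches(obj_tags, filter_dict):
    """True if every filtered tag is present and satisfies its condition."""
    for name, condition in filter_dict.items():
        value = obj_tags.get(name, _MISSING)
        if value is _MISSING or not tag_matches(value, condition):
            return False
    return True
```

With this shape, `matches({"capability:missing_values": False}, {"capability:missing_values": True})` is `False`, while a callable condition such as `lambda x: True` reproduces the presence check without triggering the warning.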

FYI @weenerplasticsgroup.