sktime / sktime

A unified framework for machine learning with time series

Home page: https://www.sktime.net


[ENH] Implement callbacks for `evaluate` method

arnaujc91 opened this issue · comments

commented

It would be really helpful if we could include callbacks, similar to how Keras does it. For example:

https://github.com/keras-team/keras/blob/a063684e1128944461809d923758f9a684e721de/keras/backend/tensorflow/trainer.py#L305-L332

Most probably I would start applying them here:

```python
# dispatch by backend and strategy
if not_parallel:
    # Run temporal cross-validation sequentially
    results = []
    for x in enumerate(yx_splits):
        is_first = x[0] == 0  # first iteration
        if strategy == "update" or (strategy == "no-update_params" and is_first):
            result, forecaster = _evaluate_window(x, _evaluate_window_kwargs)
            _evaluate_window_kwargs["forecaster"] = forecaster
        else:
            result = _evaluate_window(x, _evaluate_window_kwargs)
        results.append(result)
```

It would be useful for the callbacks to receive the training `y` and `X`, as well as the predictions, for every iteration.

Here is also the basic implementation of the Keras `Callback` class, for inspiration:

https://github.com/keras-team/keras/blob/v3.1.1/keras/callbacks/callback.py#L5

as well as the class to store the list of callbacks:

https://github.com/keras-team/keras/blob/master/keras/callbacks/callback_list.py#L9
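Adapting that pattern to `evaluate` could look roughly like the sketch below. This is speculative: the hook names (`on_evaluate_begin`, `on_window_begin`, etc.) are assumptions modeled loosely on Keras, not a proposed final API:

```python
class Callback:
    """Base class for evaluate callbacks, loosely modeled on
    keras.callbacks.Callback. All hook names are hypothetical."""

    def on_evaluate_begin(self, logs=None):
        pass

    def on_window_begin(self, window, logs=None):
        pass

    def on_window_end(self, window, logs=None):
        pass

    def on_evaluate_end(self, logs=None):
        pass


class CallbackList(Callback):
    """Holds a list of callbacks and fans each hook out to all of them,
    analogous to keras.callbacks.CallbackList."""

    def __init__(self, callbacks=None):
        self.callbacks = list(callbacks or [])

    def on_evaluate_begin(self, logs=None):
        for cb in self.callbacks:
            cb.on_evaluate_begin(logs)

    def on_window_begin(self, window, logs=None):
        for cb in self.callbacks:
            cb.on_window_begin(window, logs)

    def on_window_end(self, window, logs=None):
        for cb in self.callbacks:
            cb.on_window_end(window, logs)

    def on_evaluate_end(self, logs=None):
        for cb in self.callbacks:
            cb.on_evaluate_end(logs)
```

Because `CallbackList` implements the same interface as `Callback`, the evaluation loop would only ever need to talk to a single object.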

commented

@fkiraly @yarnabrina

Let me know if that makes sense to you. In my case, I would use these callbacks to log everything I need via MLflow.

If your goal is just to track metrics for different combinations etc. using MLflow tracking, I think that's doable even now, at the end of the tuning process.

But in general, I agree with the idea, and it would be nice to have. I don't know which default callbacks people will be interested in, though; is early stopping etc. intuitive for most sktime forecasters?

@arnaujc91, agreed that this is an excellent idea; it also bridges "classical" ML hyper-parameter tuning and tuning deep learning estimators.

Could you perhaps write some speculative code for one or two use cases that you'd like to have but are currently not possible?
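As one speculative use case along the lines discussed above: a callback that records per-window metrics so they can later be pushed to an experiment tracker. The hook name `on_window_end` is a hypothetical, Keras-style assumption; an MLflow user would replace the in-memory `history` with calls to `mlflow.log_metric` inside an active run:

```python
# Speculative use case: collect per-window metrics during evaluate.
# The on_window_end(window, logs) hook is a hypothetical callback API;
# `logs` is assumed to be a dict of metric name -> value for that window.

class MetricLoggingCallback:
    """Records (window, metrics) pairs; a stand-in for an MLflow logger."""

    def __init__(self):
        self.history = []

    def on_window_end(self, window, logs=None):
        logs = dict(logs or {})
        # with MLflow this could instead be, per metric:
        #   mlflow.log_metric(name, value, step=window)
        self.history.append((window, logs))
```

This keeps the sketch self-contained (no MLflow dependency) while showing the shape of the data a tracking callback would need per iteration.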