osirrc / jig

Jig for the Open-Source IR Replicability Challenge (OSIRRC)

Proposal for training hooks

lintool opened this issue

As a recap, here's where we're at: we have the init, index, and search hooks fleshed out reasonably well, with the following "canonical" execution sequence:

[init] -> [index: save-to-snapshot] -> [load-from-snapshot: search]

A proposal to add in a training hook is as follows:

[init] -> [index: save-to-snapshot1] -> [load-from-snapshot1: train: save-to-snapshot2]
       -> [load-from-snapshot2: test]

The train hook would get the topics and qrels as input. As part of the contract, the jig would manage the snapshotting, so from the perspective of the container it would be as if init, index, train, and test ran in an uninterrupted sequence.

The snapshotting allows the jig to efficiently retrain different models (if the image supports it), and to test on different held-out test sets.
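
To make the contract concrete, here's a rough sketch (not the actual jig code) of how the jig could manage those snapshots with `docker commit`; the image name, hook-invocation convention, and arguments are all placeholders for illustration:

```python
# Rough sketch (not the actual jig code) of snapshot management via
# `docker commit`. The image name, hook-invocation convention, and
# arguments are all assumptions for illustration.
import subprocess
import uuid


def run_hook(image, hook, args):
    """Run one hook in a container started from `image`, commit the
    resulting container as a new snapshot image, and return its tag."""
    container = "osirrc-{}-{}".format(hook, uuid.uuid4().hex[:8])
    subprocess.run(["docker", "run", "--name", container, image, hook] + args,
                   check=True)
    snapshot = "{}:snapshot-after-{}".format(image.split(":")[0], hook)
    subprocess.run(["docker", "commit", container, snapshot], check=True)
    subprocess.run(["docker", "rm", container], check=True)
    return snapshot


# [init] -> [index: save-to-snapshot1] -> [load-from-snapshot1: train: save-to-snapshot2]
#        -> [load-from-snapshot2: test]
# (init would run the same way before indexing; omitted here for brevity)
snapshot1 = run_hook("example/searcher:latest", "index",
                     ["--collection", "robust04"])
snapshot2 = run_hook(snapshot1, "train",
                     ["--topics", "topics.txt", "--qrels", "qrels.txt"])
run_hook(snapshot2, "test", ["--topics", "topics.test.txt"])
```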

Also, we would propose a cross-validation hook, e.g.,

[init] -> [index: save-to-snapshot] -> [load-from-snapshot: xvalidate]

The input to the cross-validation hook would contain the topics, qrels, and folds.
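
For illustration, here is one hypothetical shape that input could take; the file names, keys, and topic IDs below are placeholders, not a decided format:

```python
# Hypothetical shape of the xvalidate input; the file name, keys, and
# topic IDs are placeholders, not a decided format.
import json

xvalidate_input = {
    "topics": "topics.robust04.txt",   # full topic set
    "qrels": "qrels.robust04.txt",     # relevance judgments
    "folds": [                         # each fold lists its held-out topic IDs
        ["301", "302", "303"],
        ["304", "305", "306"],
    ],
}

with open("xvalidate.json", "w") as f:
    json.dump(xvalidate_input, f, indent=2)
```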

Thoughts?

Thanks for the recap!
For supervised models, or other approaches that implement best-model selection (such as nvsm), the train hook should receive an indication of which topics to use in the training and validation steps and the qrels too. In nvsm we handle that via a list of topic IDs parsed from an external file: our hook therefore receives a path to the training and validation ID lists and the path to the qrels file (to perform best-model selection according to the validation subset).
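
For illustration, a train hook entry point along those lines could look like the sketch below; the flag names are invented here and are not nvsm's actual interface:

```python
#!/usr/bin/env python3
# Sketch of a train hook entry point along these lines; the flag names
# are invented for illustration and are not nvsm's actual interface.
import argparse


def read_ids(path):
    """Read one topic ID per line."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="train hook (sketch)")
    parser.add_argument("--train-ids", required=True,
                        help="file listing training topic IDs")
    parser.add_argument("--validation-ids", required=True,
                        help="file listing validation topic IDs")
    parser.add_argument("--qrels", required=True,
                        help="qrels file used for best-model selection")
    args = parser.parse_args()

    train_topics = read_ids(args.train_ids)
    validation_topics = read_ids(args.validation_ids)
    # ... train on train_topics, then keep the checkpoint that scores best
    # on validation_topics according to args.qrels ...
```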

Cross-validation, imho, is a good option to add (with the same considerations I made above) for models that do not require a long time to train.

@albpurpura

the train hook should receive an indication of which topics to use in the training and validation steps and the qrels too.

Yes, that's exactly the plan.

In my mind, there are two ways a team can implement training and/or cross-validation in their image:

  1. actually do the training.
  2. "fake it".

By (2), I mean that an image inspects the training/validation topics it is given and selects an appropriate pre-trained model to use. This is doable because I assume we'll have some sort of standard fold setting. The hook can just return an error if it is given an "unknown" split.

Obviously (1) should be preferable, but I think (2) would be acceptable also - i.e., better than nothing.
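
For concreteness, here's a minimal sketch of what (2) could look like inside a train hook; the fingerprints and model paths are placeholders:

```python
# Minimal sketch of option (2), "fake" training: fingerprint the requested
# training topic IDs, look up a matching pre-trained model, and fail on
# anything unknown. Fingerprints and model paths are placeholders.
import hashlib
import sys


def split_fingerprint(topic_ids):
    """Stable fingerprint of a topic-ID set, independent of ordering."""
    return hashlib.sha1(",".join(sorted(topic_ids)).encode()).hexdigest()


# Fingerprints of the standard splits this image ships pre-trained models for.
PRETRAINED = {
    "3f5a...": "/models/robust04.fold1.bin",   # placeholder values
    "9bc2...": "/models/robust04.fold2.bin",
}


def select_model(train_topic_ids):
    key = split_fingerprint(train_topic_ids)
    if key not in PRETRAINED:
        # Unknown split: refuse rather than silently using the wrong model.
        sys.exit("unknown training split: no matching pre-trained model")
    return PRETRAINED[key]
```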

Reactions?

@lintool I agree with you on the implementation of the "fake" training. One could load a pre-trained model that we'd share: if that is provided, no actual training is performed; otherwise, we train a new model.

You also wrote:

This is doable because I assume we'll have some sort of standard fold setting.

Do you mean that we could provide a standard split of the topics for training and validation (and consequently testing) for all systems? That is a good idea, as it would make the performance of different models comparable.

Loading pre-trained models for cross-validation could also be done, but only if the folds are provided in advance, and of course that would be the only supported way to xvalidate a system with pre-trained models. We should also consider, before deciding to do this, that additional pre-trained models will take up a lot of space.

For example, for Robust04 I would suggest the two-fold and five-fold settings here:
https://github.com/castorini/anserini/blob/master/docs/experiments-forum2018.md

This would make our results comparable to previous work.

If the image gets a fold configuration it doesn't recognize (and doesn't have a trained model for)... it can just throw an error.