raamana / neuropredict

Easy and comprehensive assessment of predictive power, with support for neuroimaging features

Home Page: https://raamana.github.io/neuropredict/

Make predictions in a new or held out dataset

raamana opened this issue

Add the ability to input a new dataset (for example, from a different site, study, or country) and use the best model to report performance on it.

Or add an option to specify an attribute-based criterion that holds a certain subset out completely, so that performance can be reported on it.
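
A minimal sketch of the attribute-based hold-out, in plain Python. The function name `split_by_attribute`, the `attributes` mapping, and the `site` attribute are hypothetical and not part of neuropredict; they only illustrate keeping a subset out of training entirely.

```python
def split_by_attribute(sample_ids, attributes, attr_name, held_out_values):
    """Split sample IDs into (train_ids, holdout_ids).

    attributes: dict mapping sample ID -> dict of attribute name -> value
    (a hypothetical layout, assumed for this sketch).
    A sample is held out if its value for attr_name is in held_out_values.
    """
    held_out_values = set(held_out_values)
    train_ids, holdout_ids = [], []
    for sid in sample_ids:
        if attributes[sid].get(attr_name) in held_out_values:
            holdout_ids.append(sid)   # never seen during training or CV
        else:
            train_ids.append(sid)     # available to the usual CV loops
    return train_ids, holdout_ids


# Hypothetical example: hold out every subject scanned at site "B".
attributes = {
    "sub01": {"site": "A"},
    "sub02": {"site": "B"},
    "sub03": {"site": "A"},
    "sub04": {"site": "B"},
}
train_ids, holdout_ids = split_by_attribute(
    list(attributes), attributes, attr_name="site", held_out_values=["B"]
)
```

The model would then be trained and tuned on `train_ids` only, and performance reported once on `holdout_ids`.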

An obvious issue to be solved is how to define the best model: each parameter combination is evaluated only once, and a simple numerical comparison of accuracy isn't a good or robust way to pick it.

The best model could be defined as the parameter combination that was most frequently selected over N>100 repetitions of the inner CV loop (I report it for the user's information). However, there are often many combinations within the same frequency range of 30-40%, and we could employ some non-parametric statistics there to pick one!
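
A rough sketch of the frequency-based definition, assuming each repetition records the parameter combination chosen by its inner CV loop as a dict. `rank_param_combos` and `tie_margin` are hypothetical names, not neuropredict API. It only counts selections and flags combinations whose frequency is close to the top one; how to pick among those contenders (the non-parametric step) is left open.

```python
from collections import Counter


def rank_param_combos(selected, tie_margin=0.05):
    """Return (freqs, contenders) for the chosen parameter combinations.

    selected: list of dicts, one per CV repetition (assumed layout).
    freqs: maps each combination (as a sorted tuple of items) to its
    selection frequency across repetitions.
    contenders: combinations whose frequency is within tie_margin of the
    most frequent one, most frequent first.
    """
    # Dicts are not hashable, so use a canonical sorted tuple as the key.
    keys = [tuple(sorted(params.items())) for params in selected]
    counts = Counter(keys)
    n_reps = len(keys)
    freqs = {key: count / n_reps for key, count in counts.items()}
    top = max(freqs.values())
    ranked = sorted(freqs.items(), key=lambda kv: -kv[1])
    contenders = [dict(key) for key, f in ranked if top - f <= tie_margin]
    return freqs, contenders


# Hypothetical selections from 10 repetitions.
selected = (
    [{"C": 1, "gamma": 0.1}] * 4
    + [{"C": 10, "gamma": 0.1}] * 4
    + [{"C": 100, "gamma": 0.01}] * 2
)
freqs, contenders = rank_param_combos(selected)
```

Here two combinations are each selected in 40% of repetitions, so both come back as contenders and a plain "most frequent" rule cannot separate them.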

The CLI option could be --report_on
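
For illustration only, this is how such a flag could be declared with Python's standard `argparse`. The parser name, the help text, and the idea that the value is either a dataset path or an attribute criterion are assumptions for this sketch, not the existing neuropredict CLI.

```python
import argparse


def build_parser():
    """Build a toy parser with the proposed (hypothetical) --report_on flag."""
    parser = argparse.ArgumentParser(prog="neuropredict_sketch")
    parser.add_argument(
        "--report_on",
        metavar="DATASET_OR_CRITERION",
        default=None,
        help="Dataset path or attribute criterion (e.g. site=B) to hold out "
             "and report performance on. Hypothetical semantics.",
    )
    return parser


# Example invocation with an attribute-based criterion.
args = build_parser().parse_args(["--report_on", "site=B"])
```

When the flag is omitted, `args.report_on` stays `None`, so existing workflows would be unaffected.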