cross validation
thinking024 opened this issue · comments
In each iteration, `load5foldData` is called to get 5 different datasets. Each dataset consists of a training set and a test set, which are used to train and test the model. But all 5 datasets are generated from the same data, so the samples in `fold0_x_test` that are used to test the model may also appear in `fold1_x_train`, where they are used to train the model again. This means the model can learn from every `fold<n>_x_test` over the course of an iteration. In effect, the whole dataset is used for training and there is no separate test set left to evaluate the model, which results in significantly higher accuracy.
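
To make the concern concrete, here is a minimal sketch of the overlap. It does not use the repository's code: `make_folds` is a hypothetical stand-in for `load5foldData`, and it assumes a standard 5-fold split over sample indices and a single model that is carried from fold to fold without being re-initialized.

```python
import random


def make_folds(n_samples, k=5, seed=0):
    """Hypothetical stand-in for load5foldData (not the repository's code).

    Returns k (train_idx, test_idx) pairs built from the same samples.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint chunks
    splits = []
    for i in range(k):
        test_idx = set(folds[i])
        train_idx = set(indices) - test_idx
        splits.append((train_idx, test_idx))
    return splits


splits = make_folds(100)

# Within a single fold, the training and test sets do not overlap.
disjoint_within_fold = all(not (train & test) for train, test in splits)

# Across folds they do: fold 0's test samples are part of fold 1's training set.
leaked = splits[0][1] & splits[1][0]
leak_fraction = len(leaked) / len(splits[0][1])

# If one model is trained on the folds in sequence (assumption: no reset
# between folds), track how much of each test set it has already trained on.
seen = set()
seen_before_test = []
for train, test in splits:
    seen_before_test.append(len(test & seen) / len(test))
    seen |= train

print(disjoint_within_fold, leak_fraction, seen_before_test)
```

Under these assumptions, only the first fold is evaluated on samples the model has never trained on. Two common ways to avoid this are to re-initialize the model at the start of every fold, or to set aside a test set that is never part of any fold.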