split_param in class HomedepotSplitter
great-thoughts opened this issue
great-thoughts commented
The class HomedepotSplitter has split_param=[0.5, 0.25, 0.5]. What do these numbers represent?
Is it:
split_param[0]: split the unique search terms that appear in dTrain into 50% train (df0) and 50% validation (df1)
split_param[1]: split the terms appearing in df0 so that 25% stay in train only and 75% are common to train and validation
split_param[2]: 50% of that 75% will be appended to the validation dataset?
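To make the guess above concrete, here is a minimal sketch of that reading at the level of unique search terms. It only encodes the interpretation proposed in this question and is not taken from the HomedepotSplitter code. The function name `split_terms_as_guessed`, the dictionary keys, and the `seed` argument are hypothetical, and rounding down with `int()` is an assumption.

```python
import random


def split_terms_as_guessed(terms, split_param=(0.5, 0.25, 0.5), seed=0):
    """Partition unique search terms following the reading guessed in the question.

    Hypothetical helper: it is NOT the HomedepotSplitter implementation.
    """
    rng = random.Random(seed)
    unique_terms = sorted(set(terms))
    rng.shuffle(unique_terms)

    # split_param[0]: share of unique terms assigned to df0 (train side);
    # the remainder goes to df1 (validation side).
    n_df0 = int(len(unique_terms) * split_param[0])
    df0_terms = unique_terms[:n_df0]
    df1_terms = unique_terms[n_df0:]

    # split_param[1]: share of df0 terms kept in train only;
    # the remaining df0 terms are treated as common to train and validation.
    n_train_only = int(len(df0_terms) * split_param[1])
    train_only = df0_terms[:n_train_only]
    common = df0_terms[n_train_only:]

    # split_param[2]: share of the common terms appended to validation.
    n_appended = int(len(common) * split_param[2])
    appended_to_valid = common[:n_appended]

    return {
        "train_only": set(train_only),
        "common": set(common),
        "appended_to_valid": set(appended_to_valid),
        "valid_only": set(df1_terms),
    }


parts = split_terms_as_guessed([f"term{i}" for i in range(16)])
for name, group in parts.items():
    print(name, len(group))
```

With 16 unique terms and the default values, this reading gives 8 validation-only terms, 2 train-only terms, 6 common terms, and 3 of those common terms appended to validation. Whether that matches what the class actually does is exactly what is being asked.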
How do those values relate to the Venn diagram?