[MiniLLM] Why does dolly have only 12435 training samples?
yumath opened this issue
However, Section 3.1 of your paper states:
"Training
We construct the training data from databricks-dolly-15k consisting of 15K human-written instruction-response pairs. We randomly split 14K samples as the training set D and left 500 samples for validation and testing, respectively."
Could you explain the discrepancy? Thanks very much.