remove duplicate
Ski-ing opened this issue
Was any strict deduplication performed between the training data and the HumanEval test set before training?
I guess they didn't
We have checked the SFT training set. The HumanEval test set has not leaked into it.
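For readers wondering what such a leak check might look like, here is a minimal sketch based on exact n-gram overlap. This is an illustration only and not necessarily the procedure used for the check above; the function names, the tokenization rule, and the default window `n=10` are all assumptions chosen for the example.

```python
import re


def normalize(text):
    """Lowercase the text and split it into alphanumeric/underscore tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def ngrams(tokens, n):
    """Return the set of all contiguous n-token windows (empty if too short)."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def find_contaminated(train_samples, test_samples, n=10):
    """Return indices of training samples sharing any n-gram with the test set.

    Hypothetical helper: n=10 is an assumed window size, not a known setting.
    """
    test_grams = set()
    for sample in test_samples:
        test_grams |= ngrams(normalize(sample), n)

    flagged = []
    for idx, sample in enumerate(train_samples):
        if ngrams(normalize(sample), n) & test_grams:
            flagged.append(idx)
    return flagged
```

Flagged indices can then be dropped from the training set or inspected by hand. Exact n-gram matching misses paraphrased or reformatted copies, so a stricter check would likely need fuzzy or embedding-based matching as well.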
Are there any plans to open-source the training data?