Why do the validation and test sets of the ANLI dataset with rationales have more than 1000 examples?

Question

BalloutAI opened this issue 9 months ago · comments

I am looking at the files in the llm folder for anli1. The val_cot_0 file has more than 1400 samples (I counted the occurrences of "so the answer is" in the file), while the validation set without rationales has 1000 samples. Why is there a difference?
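
For reference, the count described above can be reproduced with a minimal sketch like the one below. It assumes the file can be read as UTF-8 text; `count_phrase` and `count_in_file` are hypothetical helper names, not part of the repository, and the path passed to `count_in_file` would be wherever val_cot_0 is stored locally.

```python
from pathlib import Path

# Phrase used as a proxy for "one sample" in the count above.
PHRASE = "so the answer is"


def count_phrase(text: str, phrase: str = PHRASE) -> int:
    """Count non-overlapping, case-insensitive occurrences of phrase in text."""
    return text.lower().count(phrase.lower())


def count_in_file(path: str, phrase: str = PHRASE) -> int:
    """Read a file as UTF-8 text (an assumption) and count the phrase in it."""
    return count_phrase(Path(path).read_text(encoding="utf-8"), phrase)
```

This counts occurrences of the phrase rather than examples, so the two numbers can differ if any single entry contains the phrase more than once.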