Number of references used for training and testing

Question


senjed opened this issue 3 years ago

I noticed that generate_input_dart.py uses only 3 references per example for evaluation, but some examples have many more. Could you provide more details about how the results in the paper were computed? I can't seem to replicate them. Also, if you could share the fine-tuned T5 and BART models on DART, that would be very helpful.
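To make the concern concrete, here is a minimal sketch of what a cap of 3 references does to examples that have more. It is not the repository's code: the toy data, the function names, and the `MAX_REFS` constant are all hypothetical, and the real script may select or pad references differently.

```python
from collections import Counter

MAX_REFS = 3  # the cap described above; assumed here, not read from the script


def truncate_references(examples, max_refs=MAX_REFS):
    """Keep at most max_refs references per example."""
    return [refs[:max_refs] for refs in examples]


def reference_count_histogram(examples):
    """Map 'number of references' to 'number of examples with that many'."""
    return Counter(len(refs) for refs in examples)


# Hypothetical toy data: each inner list holds the references for one example.
examples = [
    ["ref a1"],
    ["ref b1", "ref b2", "ref b3"],
    ["ref c1", "ref c2", "ref c3", "ref c4", "ref c5"],
]

truncated = truncate_references(examples)
dropped = sum(len(full) - len(kept) for full, kept in zip(examples, truncated))

print(reference_count_histogram(examples))
print("references dropped by the cap:", dropped)
```

Running the same histogram over the actual test set would show how many examples are affected by the cap, which might help explain a gap between reproduced and reported scores.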