Number of references used for training and testing
senjed opened this issue
I noticed that generate_input_dart.py uses only 3 references per example for evaluation, even though some examples have many more. Could you provide more details about how the results in the paper were computed? I can't seem to replicate them. Also, if you could share the T5 and BART models fine-tuned on DART, that would be very helpful.
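To make the question concrete, here is a minimal sketch of what capping references at 3 could look like and how many references it would discard. It is not taken from generate_input_dart.py: the function name `cap_references`, the `annotations`/`text` field layout, and the sample data are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

MAX_REFS = 3  # the cap mentioned above; everything else here is hypothetical


def cap_references(
    entries: List[Dict], max_refs: int = MAX_REFS
) -> Tuple[List[List[str]], int]:
    """Keep at most `max_refs` references per example.

    Returns the capped reference lists and the total number of
    references that were dropped across all examples.
    """
    capped: List[List[str]] = []
    dropped = 0
    for entry in entries:
        # Assumed layout: each example has a list of annotations,
        # each holding one reference sentence under "text".
        refs = [a["text"] for a in entry.get("annotations", [])]
        capped.append(refs[:max_refs])
        dropped += max(0, len(refs) - max_refs)
    return capped, dropped


# Made-up examples: one with 5 references, one with 2.
sample = [
    {"annotations": [{"text": f"reference {i}"} for i in range(5)]},
    {"annotations": [{"text": "only a"}, {"text": "only b"}]},
]

capped_refs, num_dropped = cap_references(sample)
print(len(capped_refs[0]), len(capped_refs[1]), num_dropped)
```

If the reported scores were computed against all available references rather than the first 3, a difference like the one counted here could explain part of the gap.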