Why are so many sentences in your NLI datasets grammatically incorrect?

leoozy opened this issue a year ago · comments

Thank you for your excellent work. I am running the supervised setting and find that many sentences in your NLI dataset are grammatically incorrect, such as: ", heritage assets, Federal mission PP&E), uncertain historical cost basis ". The SNLI and MNLI datasets are human-labeled, so I would not expect them to contain such sentences. Do you apply any post-processing to these sentences? Thank you!

Tianyu Gao · Answer 1 · Wed Apr 05 2023 21:49:47 GMT+0800 (China Standard Time)

Hi,

We take the SNLI and MNLI datasets directly, without modification, so that is probably noise from the datasets themselves.
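For anyone who wants a rough sense of how common such fragments are, a minimal sketch of a heuristic check follows. It assumes the sentences are available as plain Python strings. The helper name `looks_like_fragment` and its two rules (leading punctuation, unbalanced parentheses) are illustrative assumptions and not part of this repository or its data pipeline. The second example sentence is made up for contrast.

```python
def looks_like_fragment(sentence: str) -> bool:
    """Heuristically flag text that looks like a clipped fragment.

    Hypothetical helper for illustration; the rules are assumptions.
    """
    text = sentence.strip()
    if not text:
        return True
    # Text that opens with punctuation such as "," or ")" was likely cut mid-clause.
    if text[0] in ",;:)]}.":
        return True
    # Track parenthesis depth: a ")" with no earlier "(" or a leftover "("
    # suggests the text was truncated.
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return True
    return depth != 0


examples = [
    # Fragment quoted in the question above.
    ", heritage assets, Federal mission PP&E), uncertain historical cost basis ",
    # Made-up well-formed sentence for contrast.
    "A man is playing a guitar on stage.",
]
flagged = [s for s in examples if looks_like_fragment(s)]
```

A check like this only catches obvious truncation. It will miss fragments that happen to start with a word and have balanced parentheses, so treat the count as a lower bound.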

Junlei Zhang · Answer 2 · Wed Apr 05 2023 21:59:30 GMT+0800 (China Standard Time)

Thank you for your help!