Why are so many sentences in your NLI datasets grammatically incorrect?
leoozy opened this issue
Junlei Zhang commented
Thank you for your excellent work. I am running the supervised setting and found that many sentences in your NLI dataset are grammatically incorrect, for example: ", heritage assets, Federal mission PP&E), uncertain historical cost basis ". SNLI and MNLI are human-labeled datasets, so I would guess they do not contain such sentences. Do you apply any post-processing to these sentences? Thank you!
Tianyu Gao commented
Hi,
We take the SNLI and MNLI datasets directly, so that may be noise that is already present in the datasets.
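For anyone who wants to gauge how much of this noise is present, here is a minimal sketch of a heuristic check. It is not part of this repository's preprocessing; the function name `looks_like_fragment` and the two rules it applies (leading punctuation, unbalanced parentheses) are assumptions chosen only for illustration, and they will miss many other kinds of noise.

```python
def looks_like_fragment(sentence: str) -> bool:
    """Hypothetical heuristic: flag text that is empty, starts with
    punctuation, or has unbalanced parentheses."""
    text = sentence.strip()
    if not text:
        return True
    # A sentence cut out of the middle of a longer one often starts
    # with a comma or a closing bracket.
    starts_with_punct = text[0] in ",.;:)]}"
    # A stray "(" or ")" suggests the span was truncated.
    unbalanced = text.count("(") != text.count(")")
    return starts_with_punct or unbalanced


# The example quoted in this issue, and a well-formed sentence for contrast.
example = ", heritage assets, Federal mission PP&E), uncertain historical cost basis "
print(looks_like_fragment(example))
print(looks_like_fragment("A man is playing a guitar."))
```

A check like this only flags candidates for manual inspection; whether to drop or keep flagged sentences is a separate decision.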
Junlei Zhang commented
Thank you for your help!