Create dataset loader for multilingual-NLI-26lang-2mil7
SamuelCahyawijaya opened this issue
Dataset | multilingual_nli_26lang |
---|---|
Description | This dataset contains 2 730 000 NLI text pairs in 26 languages spoken by more than 4 billion people. The dataset can be used to train models for multilingual NLI (Natural Language Inference) or zero-shot classification. The dataset is based on the English datasets MultiNLI, Fever-NLI, ANLI, LingNLI and WANLI and was created using the latest open-source machine translation models. |
Subsets | - |
Languages | ind, vie, eng |
Tasks | Natural Language Inference |
License | Unknown (unknown) |
Homepage | https://huggingface.co/datasets/MoritzLaurer/multilingual-NLI-26lang-2mil7 |
HF URL | https://huggingface.co/datasets/MoritzLaurer/multilingual-NLI-26lang-2mil7 |
Paper URL | https://www.cambridge.org/core/journals/political-analysis/article/less-annotating-more-classifying-addressing-the-data-scarcity-issue-of-supervised-machine-learning-with-deep-transfer-learning-and-bertnli/05BB05555241762889825B080E097C27 |
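
As a starting point for the loader, here is a minimal sketch of how the three requested languages might map onto splits in the Hugging Face repo. It assumes the splits are named `<two-letter language code>_<source>` (for example `id_mnli`) and that the source suffixes are `mnli`, `fever`, `anli`, `wanli`, and `ling`. Neither assumption is stated in this issue, so both should be checked against the dataset card. The helper names `split_names` and `load_language` are hypothetical.

```python
# ISO 639-3 codes from the issue, mapped to the two-letter prefixes that the
# Hugging Face split names are ASSUMED to use (e.g. "id_mnli").
LANG_PREFIX = {"ind": "id", "vie": "vi", "eng": "en"}

# Source-corpus suffixes, ASSUMED from the five English datasets listed in the
# description (MultiNLI, Fever-NLI, ANLI, WANLI, LingNLI).
SOURCES = ("mnli", "fever", "anli", "wanli", "ling")

HF_REPO = "MoritzLaurer/multilingual-NLI-26lang-2mil7"


def split_names(lang):
    """Return the assumed split names for one ISO 639-3 language code."""
    prefix = LANG_PREFIX[lang]
    return [f"{prefix}_{source}" for source in SOURCES]


def load_language(lang):
    """Load every assumed split for one language (requires network access)."""
    # Imported lazily so the name-mapping helper works without `datasets`.
    from datasets import load_dataset

    return {name: load_dataset(HF_REPO, split=name) for name in split_names(lang)}
```

Calling `split_names("ind")` only builds the candidate names. `load_language("ind")` downloads the data and fails if a split name does not exist in the repo, which makes it a quick way to validate the naming assumption.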
#self-assign