Create dataset loader for multilingual-NLI-26lang-2mil7

Question

SamuelCahyawijaya opened this issue 3 months ago

Dataset	multilingual_nli_26lang
Description	This dataset contains 2 730 000 NLI text pairs in 26 languages spoken by more than 4 billion people. The dataset can be used to train models for multilingual NLI (Natural Language Inference) or zero-shot classification. The dataset is based on the English datasets MultiNLI, Fever-NLI, ANLI, LingNLI and WANLI and was created using the latest open-source machine translation models.
Subsets	-
Languages	ind, vie, eng
Tasks	Natural Language Inference
License	Unknown (unknown)
Homepage	https://huggingface.co/datasets/MoritzLaurer/multilingual-NLI-26lang-2mil7
HF URL	https://huggingface.co/datasets/MoritzLaurer/multilingual-NLI-26lang-2mil7
Paper URL	https://www.cambridge.org/core/journals/political-analysis/article/less-annotating-more-classifying-addressing-the-data-scarcity-issue-of-supervised-machine-learning-with-deep-transfer-learning-and-bertnli/05BB05555241762889825B080E097C27
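
For anyone picking this up, here is a rough sketch of how the three requested languages might be pulled from the HF URL above. It assumes the Hub splits are named `<2-letter language>_<source>` (for example `id_mnli`), which should be checked against the dataset card first. `LANG_CODES`, `SOURCES`, `split_names`, and `load_language` are hypothetical names for illustration only, not part of any existing loader.

```python
# Assumption: splits on the Hub are named "<2-letter lang>_<source>", e.g. "id_mnli".
# Verify this against the dataset card before relying on it.
HF_PATH = "MoritzLaurer/multilingual-NLI-26lang-2mil7"

# ISO 639-3 codes from this issue mapped to the 2-letter codes assumed in split names.
LANG_CODES = {"ind": "id", "vie": "vi", "eng": "en"}

# Source datasets named in the description, abbreviated as assumed in split names.
SOURCES = ["mnli", "fever", "anli", "ling", "wanli"]


def split_names(lang):
    """Return the assumed split names for one ISO 639-3 language code."""
    prefix = LANG_CODES[lang]
    return [f"{prefix}_{source}" for source in SOURCES]


def load_language(lang):
    """Load every assumed split for one language (needs `datasets` and network access)."""
    # Imported lazily so the helpers above can be used without the package installed.
    from datasets import load_dataset

    return {name: load_dataset(HF_PATH, split=name) for name in split_names(lang)}
```

Calling `load_language("ind")` would then return a dict keyed by split name, provided the naming assumption holds. Keeping the split-name logic separate from the download makes it easy to adjust if the actual split names differ.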

Akhdan Fadhilah · Answer 1 · Mon Apr 01 2024 18:56:38 GMT+0800 (China Standard Time)

#self-assign