Sungai (pronounced soon-nai) means river in Malay and is a sample multilingual dataset. It is meant to be used for NLP multilingual model distillation. "mdd" stands for "multilingual distillation dataset".
Sample multilingual data and tools for creating the data - used for NLP multilingual NLP research
Sungai (pronounced soon-nai) means river in Malay and is a sample multilingual dataset. It is meant to be used for NLP multilingual model distillation. "mdd" stands for "multilingual distillation dataset".
Sample multilingual data and tools for creating the data - used for NLP multilingual NLP research
Apache License 2.0