IndoNLP

Organization data from GitHub: https://github.com/IndoNLP

We are researchers who push up the lower bound of the Indonesian NLP standard. We are collaborating to release new data resources and benchmarks.

GitHub: @IndoNLP

IndoNLP's repositories

indonlu

The first large-scale natural language processing benchmark for the Indonesian language. We provide multiple downstream tasks, pre-trained IndoBERT models, and starter code. (AACL-IJCNLP 2020)

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 629 | Issues: 16 | Issues: 36

nusa-crowd

A collaborative project to collect datasets in Indonesian languages.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 275 | Issues: 6 | Issues: 191

nusax

A high-quality parallel resource for sentiment analysis covering 10 low-resource Indonesian languages, English, and Indonesian. (Outstanding Paper at EACL 2023)

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 107 | Issues: 9 | Issues: 0

indonlg

The first large-scale natural language generation benchmark for Indonesian, Sundanese, and Javanese. We provide multiple downstream tasks, pre-trained IndoGPT and IndoBART models, and starter code. (EMNLP 2021)

Language: Python | License: Apache-2.0 | Stargazers: 76 | Issues: 5 | Issues: 1

nusa-writes

NusaWrites is an in-depth analysis of corpus collection strategies and a comprehensive language modeling benchmark for underrepresented and extremely low-resource Indonesian local languages.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 27 | Issues: 6 | Issues: 2

cendol

Indonesian T0 | Instruction-tuning for low-resource and extremely low-resource Austronesian languages

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 14 | Issues: 7 | Issues: 0

nusa-catalogue

Dataset Catalogue Homepage for Indonesian Languages

Language: JavaScript | License: Apache-2.0 | Stargazers: 10 | Issues: 5 | Issues: 5

nusacrowd-asr

NusaCrowd ASR Experiment

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 3 | Issues: 4 | Issues: 0

.github

Landing page

Language: SCSS | License: Apache-2.0 | Stargazers: 1 | Issues: 4 | Issues: 0