Erland366 / COPAL-ID

🏺COPAL-ID (Choice of Plausible Alternatives Local Nuances - Indonesia)

Welcome folks! 🎉🎉

This repository contains data from our research: COPAL-ID: Indonesian Language Reasoning with Local Culture and Nuances (To be Announced)!

Our dataset comprises 559 instances that test Common Sense Reasoning (CSR). The task focuses on Indonesian local nuances and culture and is presented in COPA style. Here are a few examples from our data:

| Premise | Choice 1 | Choice 2 | Question Type | Label |
| --- | --- | --- | --- | --- |
| Penumpang angkutan umum ingin turun di jalan. | Penumpang teriak "kanan" | Penumpang teriak "kiri" | effect | Choice 2 |
| Dia merasa masuk angin | Dia membuka jendela untuk meperbaiki sirkulasi udara | Dia meminta tolong untuk kerokan | effect | Choice 2 |
| Kemarin malam, ia baru selesai jaga lilin. | Ia adalah orang yang taat beribadah | Ia percaya dengan ilmu hitam | cause | Choice 2 |
| Ia dibawa ke kantor polisi akibat mencuri televisi | Ia tertangkap basah membawa televisi | Ia membawa televisi dengan tangan merah | cause | Choice 1 |
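
To make the COPA-style format concrete, here is a minimal Python sketch of how one instance could be represented and how predictions could be scored. It is only an illustration, not our evaluation code: the `CopaInstance` class, the `build_prompt` wording, and the 1/2 label encoding are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class CopaInstance:
    premise: str
    choice1: str
    choice2: str
    question: str  # "cause" or "effect"
    label: int     # assumed encoding: 1 for Choice 1, 2 for Choice 2


def build_prompt(inst: CopaInstance) -> str:
    """Turn an instance into a two-option question (hypothetical wording)."""
    if inst.question == "cause":
        ask = "What was the cause?"
    else:
        ask = "What happened as a result?"
    return f"{inst.premise}\n{ask}\n1. {inst.choice1}\n2. {inst.choice2}"


def accuracy(instances, predictions) -> float:
    """Fraction of predictions that match the gold label."""
    correct = sum(int(p == inst.label) for inst, p in zip(instances, predictions))
    return correct / len(instances)


# Two rows taken from the examples above.
examples = [
    CopaInstance(
        "Penumpang angkutan umum ingin turun di jalan.",
        'Penumpang teriak "kanan"',
        'Penumpang teriak "kiri"',
        "effect",
        2,
    ),
    CopaInstance(
        "Ia dibawa ke kantor polisi akibat mencuri televisi",
        "Ia tertangkap basah membawa televisi",
        "Ia membawa televisi dengan tangan merah",
        "cause",
        1,
    ),
]

# A trivial baseline that always answers Choice 1.
always_first = [1 for _ in examples]
print(accuracy(examples, always_first))  # 0.5
```

Because every instance has exactly two choices, a constant or random baseline is expected to land near 50% accuracy, which is the natural reference point for any model score.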

Data

Our data can be downloaded from Hugging Face, or you can clone this repository and use the contents of /data.

  1. test_copal.csv contains COPAL-ID
  2. test_copal_colloquial.csv contains the colloquial version of COPAL-ID
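
If you cloned the repository, the two files can be inspected with Python's standard library. The sketch below is an assumption-laden example: the `read_rows` helper is ours for illustration, and the header names in the in-memory sample are hypothetical, so print the real column names before relying on them.

```python
import csv
import io
from pathlib import Path


def read_rows(handle):
    """Read CSV rows into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle))


# Hypothetical in-memory sample; the real header names and label
# encoding in the CSV files may differ.
sample = io.StringIO(
    "premise,choice1,choice2,question,label\n"
    "Dia merasa masuk angin,Dia membuka jendela,"
    "Dia meminta tolong untuk kerokan,effect,1\n"
)
sample_rows = read_rows(sample)
print(sample_rows[0]["premise"])

# File names as listed above, relative to the repository root.
for name in ("test_copal.csv", "test_copal_colloquial.csv"):
    path = Path("data") / name
    if path.exists():
        with path.open(newline="", encoding="utf-8") as handle:
            rows = read_rows(handle)
        columns = list(rows[0].keys()) if rows else []
        print(name, len(rows), "rows; columns:", columns)
```

Opening the files with `newline=""` and an explicit UTF-8 encoding follows the `csv` module's recommendation and avoids surprises with quoted fields.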

Further detailed information will be provided in the future!

Code

To be announced. Please bear with us while we clean up our code 🙏🙏, especially the messy parts.

For instance, we are cleaning up things that look like this: # DONT CHANGE THIS CODE OR IT WILL BREAK, print('TESTTESTTEST'), print("CAT A MEONG MEOW"), how_should_i_name_this_var=123.

Team

Cultured

  1. Haryo Akbarianto Wibowo @ MBZUAI
  2. Erland Hilman Fuadi @ Independent Researcher
  3. Made Nindyatama Nityasya @ Independent Researcher
  4. Radityo Eko Prasojo @ Pitik
  5. Alham Fikri Aji @ MBZUAI

About

License: Creative Commons Attribution Share Alike 4.0 International