tattle-made / Uli

Software and Resources for Mitigating Online Gender Based Violence in India

Home Page: https://uli.tattle.co.in

Participatory Approaches to Building Datasets on Abuse

tarunima opened this issue · comments

Description:

Automated approaches to abuse detection rely on annotated datasets. At least at present, unsupervised machine learning alone cannot detect abuse across languages. To fill the gap in abuse detection datasets for Indian languages, Tattle started the Uli project, which creates datasets on gendered abuse in Indian languages. The project also takes a survivor-centered perspective on abuse: the dataset was created with people of marginalized genders who are at the receiving end of abuse. The first dataset, on abusive tweets, helped us develop a methodology for participatory datasets that we would now like to extend to more languages and modalities.

The Scope of This Task:

  1. Review literature about datasets of abuse detection in images, videos and audio.
  2. Create a dataset of images from social media that could be annotated by the existing community of researchers, survivors, and activists.
  3. Expand the community of annotators.
  4. Conduct qualitative research to define abuse in multimodal datasets.
  5. Organize annotations.
  6. Release the dataset.

This ticket should be treated as a statement of intent for a multi-year project. If you're interested in collaborating on this project, please leave a comment.

This issue is stale because it has been open for 30 days with no activity.

Hi! Is this task still open to participants? I am interested in volunteering.

I research Online Hate Speech in low-resource settings. I have experience in curating datasets for gender-based stereotypes and I have worked on Multi-Modal Audio Abuse Detection in Low Resource Settings.