abumafrim / YOSM

YOSM: A NEW YORUBA SENTIMENT CORPUS FOR MOVIE REVIEWS

This repository contains the code for training movie review sentiment classifiers and the YOSM data for the Yorùbá language. To run the code, use any of the bash scripts (*.sh).

The code is based on the HuggingFace implementation (License: Apache 2.0).

The data is licensed under CC-BY-4.0.

Required dependencies

  • Python
    • transformers : state-of-the-art natural language processing for TensorFlow 2.0 and PyTorch.
    • scikit-learn (imported as sklearn) : for F1-score evaluation.
    • ptvsd : remote debugging server for Python in Visual Studio and Visual Studio Code.
pip install transformers scikit-learn ptvsd
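
Before launching a training script, it can help to confirm that the dependencies listed above are importable. The following is a minimal sketch, not part of the repository: the helper name missing_modules and the REQUIRED_MODULES list are hypothetical, and it uses only the Python standard library.

```python
from importlib.util import find_spec

# Import names for the packages listed above. The scikit-learn
# distribution is imported under the name "sklearn".
# This list is an assumption based on the README, not repository code.
REQUIRED_MODULES = ["transformers", "sklearn", "ptvsd"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the import names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If the check reports a missing module, rerunning the pip command above should resolve it.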

If you make use of this dataset, please cite us:

BibTeX entry and citation info

@article{shode_africanlp,
    author = {Shode, Iyanuoluwa and Adelani, David Ifeoluwa and Feldman, Anna},
    title = "{YOSM: A new Yorùbá Sentiment Corpus for Movie Reviews}",
    journal = {AfricaNLP 2022 @ICLR},
    year = {2022},
    month = {4},
    url = {https://openreview.net/forum?id=rRzx5qzVIb9},
}
