TharinduDR / MUDES

Multilingual Detection of Offensive Spans

MUDES - {Mu}ltilingual {De}tection of Offensive {S}pans

We provide state-of-the-art models to detect toxic spans in text. We have evaluated our models on the Toxic Spans task at SemEval 2021 (Task 5).

Installation

You first need to install PyTorch. The recommended PyTorch version is 1.6. Please refer to the PyTorch installation page for the specific install command for your platform.

Once PyTorch is installed, you can install MUDES from pip.

From pip

pip install mudes

Pretrained MUDES Models

We will keep releasing new models, so please keep in touch. We have evaluated the models on the trial set released for the Toxic Spans task at SemEval 2021.

Models Average F1
en-base 0.6734
en-large 0.6886
multilingual-base 0.5953
multilingual-large 0.6013
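For readers unfamiliar with how span predictions are scored, the sketch below shows a character-offset F1 averaged over texts, in the style used by SemEval-2021 Task 5. It is an assumption that the "Average F1" column above was computed this way; the function names `span_f1` and `average_f1` are illustrative and are not part of MUDES.

```python
def span_f1(predicted, gold):
    """Character-offset F1 for a single text.

    `predicted` and `gold` are iterables of character offsets
    marked as toxic. Both empty counts as a perfect match.
    """
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        return 1.0
    if not predicted or not gold:
        return 0.0
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def average_f1(all_predicted, all_gold):
    """Mean of the per-text F1 scores over a dataset."""
    scores = [span_f1(p, g) for p, g in zip(all_predicted, all_gold)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Two texts: a half-overlapping prediction, and a text with no toxic span.
    print(average_f1([[0, 1, 2, 3], []], [[2, 3, 4, 5], []]))
```

Scoring per text and then averaging means a short text with a missed span costs as much as a long one.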

Prediction

The following code can be used to predict toxic spans in text. When executed, it downloads the relevant model and returns the toxic spans.

from mudes.app.mudes_app import MUDESApp

app = MUDESApp("en-large", use_cuda=False)
print(app.predict_toxic_spans("You motherfucking cunt", spans=True))
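If the prediction comes back as a flat list of character offsets (the SemEval-2021 Task 5 format; this is an assumption, so check the actual return value of `predict_toxic_spans`), it can help to turn those offsets back into readable substrings. The helpers below are a minimal sketch of that post-processing; `group_offsets` and `extract_spans` are hypothetical names, not MUDES functions.

```python
def group_offsets(offsets):
    """Collapse character offsets into inclusive (start, end) runs."""
    runs = []
    for offset in sorted(set(offsets)):
        if runs and offset == runs[-1][1] + 1:
            # Extend the current run by one character.
            runs[-1] = (runs[-1][0], offset)
        else:
            # Start a new run at this offset.
            runs.append((offset, offset))
    return runs


def extract_spans(text, offsets):
    """Return the substrings of `text` covered by each contiguous run."""
    return [text[start:end + 1] for start, end in group_offsets(offsets)]


if __name__ == "__main__":
    text = "a very rude reply"
    # Offsets 7..10 cover the word "rude".
    print(extract_spans(text, [7, 8, 9, 10]))
```

The same helpers work on gold annotations, which makes side-by-side comparison of predicted and annotated spans straightforward.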

System Demonstration

An experimental demonstration interface called MUDES-UI has been released on GitHub.

Citing & Authors

If you are using this repo, please consider citing these papers.

@inproceedings{ranasinghemudes,
  title={{MUDES: Multilingual Detection of Offensive Spans}},
  author={Tharindu Ranasinghe and Marcos Zampieri},
  booktitle={Proceedings of NAACL},
  year={2021}
}

@inproceedings{ranasinghe2021semeval,
  title={{WLV-RIT at SemEval-2021 Task 5: A Neural Transformer Framework for Detecting Toxic Spans}},
  author={Ranasinghe, Tharindu and Sarkar, Diptanu and Zampieri, Marcos and Ororbia, Alex},
  booktitle={Proceedings of SemEval},
  year={2021}
}

About

License: Apache License 2.0