liuchaoqun / hate-speech-datasets

hate-speech-datasets

This repository contains the datasets mentioned in several papers.

Papers

  • dataset
  • problem
    • provide a large-scale hate speech dataset
  • performance (baseline)
  • dataset
  • problem
    • models tend to be biased toward group identifiers and fail to learn from their context, which leads to false positives.
  • methodology
    • propose a novel regularization technique (regularizing SOC explanations of group identifiers) that encourages models to learn from the context of group identifiers in addition to the identifiers themselves.
  • performance
  • dataset
  • problem
    • most work has focused on explicit or overt hate speech, failing to address a more pervasive form based on coded or indirect language.
  • methodology
    • introduce a theoretical taxonomy of implicit hate speech
    • provide a dataset for implicit hate speech
    • provide several state-of-the-art baselines for detecting and explaining implicit hate speech
  • performance
  • dataset
  • problem
    • earlier deep learning methods often relied only on pre-trained models or deeper networks to obtain semantic features, ignoring the sentiment features of the target sentences and external sentiment resources; this limits the performance of neural networks in hate speech detection.
  • methodology
    • propose a hate speech detection framework based on sentiment knowledge sharing
  • performance
  • dataset
  • problem
    • identify who the target is in a given hate speech post.
    • identify which aspects (or characteristics) are attributed to the target in the post.
  • methodology
  • problem
  • methodology
    • BERT + post-processing
  • performance
    • Our system significantly outperformed the provided baseline and achieved an F1-score of 0.683, placing Lone Pine in 17th place out of 91 teams in the competition.
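
The regularization idea listed above (regularizing SOC explanations of group identifiers) can be illustrated with a minimal sketch. It assumes that an explanation score for each group identifier in a sentence has already been computed by some explanation method, and that the penalty is the sum of the squared scores scaled by a weight. The function name `regularized_loss`, the parameter `alpha`, and the example numbers are hypothetical and are not taken from the paper's code.

```python
from typing import Sequence


def regularized_loss(classification_loss: float,
                     identifier_scores: Sequence[float],
                     alpha: float = 0.1) -> float:
    """Add a penalty on the importance attributed to group identifiers.

    classification_loss: the ordinary task loss for one example.
    identifier_scores:   explanation scores of the group identifiers that
                         occur in the example (assumed to be precomputed).
    alpha:               hypothetical weight of the penalty term.
    """
    # Squaring penalizes large positive and large negative attributions alike,
    # pushing the model to rely on context rather than the identifier itself.
    penalty = sum(score ** 2 for score in identifier_scores)
    return classification_loss + alpha * penalty


if __name__ == "__main__":
    # Two identifiers with illustrative explanation scores of 0.2 and -0.4.
    print(regularized_loss(0.5, [0.2, -0.4], alpha=0.1))
    # A sentence without group identifiers keeps its original loss.
    print(regularized_loss(0.5, [], alpha=0.1))
```

In an actual training loop the loss and the scores would be differentiable tensors, so that the penalty can influence the model's parameters; plain floats are used here only to keep the sketch self-contained.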
