marjanhs / stance

A stance detection model based on a pre-trained language model

Accurate information from both sides of contemporary issues is known to be an 'antidote in confirmation bias'. While this type of information helps educators improve vital skills, including critical thinking and open-mindedness, it is relatively rare and hard to find online. Since Procon.org shares well-researched argumentative opinions (arguments) on controversial issues in a nonpartisan format, detecting the stance of arguments is a crucial step toward automating the organization of such resources. We use a universal pretrained language model with a weight-dropped LSTM neural network to leverage the context of an argument for stance detection on the proposed dataset. Experimental results show that the dataset is challenging; however, utilizing the pretrained language model fine-tuned on context information yields a general model that beats the competitive baselines. We also provide analysis to find the segments of an argument that are informative to our stance detection model, and we investigate the relationship between the sentiment of an argument and its stance.

The paper was accepted at SBP-BRiMS 2019. Preprint version, Springer version

Dataset:

The procon dataset is purely for academic/research use and not for commercial purposes. Rights to the data belong to procon.org as they hosted the data.

Please contact me (email:) to request the dataset!

Requirements

-fastai 2018 release (version 1.0.6 or later in 2018)

-nltk

-PyTorch

Preprocessing

procon_ai_utils.py converts the dataset files to token and token-id files and creates the itos/stoi files.
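
The token-to-id conversion described above can be illustrated with a minimal sketch. This is not the code in procon_ai_utils.py: the function names `build_vocab` and `numericalize`, the special tokens `_unk_` and `_pad_`, and the whitespace tokenizer (a stand-in for nltk) are all assumptions chosen for illustration. It only shows the general idea of an itos list (id to string) and a stoi mapping (string to id).

```python
from collections import Counter

# Hypothetical special tokens; the real preprocessing may use different names.
UNK, PAD = "_unk_", "_pad_"


def build_vocab(tokenized_docs, max_vocab=60000, min_freq=1):
    """Build itos (list: id -> token) and stoi (dict: token -> id)."""
    freq = Counter(tok for doc in tokenized_docs for tok in doc)
    itos = [UNK, PAD]  # reserve ids 0 and 1 for the special tokens
    for tok, count in freq.most_common(max_vocab):
        if count >= min_freq and tok not in (UNK, PAD):
            itos.append(tok)
    stoi = {tok: i for i, tok in enumerate(itos)}
    return itos, stoi


def numericalize(tokenized_docs, stoi):
    """Map every token to its id; unseen tokens fall back to the UNK id (0)."""
    unk_id = stoi[UNK]
    return [[stoi.get(tok, unk_id) for tok in doc] for doc in tokenized_docs]


# Toy arguments; lowercased whitespace splitting stands in for a real tokenizer.
docs = [
    "School uniforms reduce bullying .",
    "School uniforms limit expression .",
]
tokens = [d.lower().split() for d in docs]
itos, stoi = build_vocab(tokens)
ids = numericalize(tokens, stoi)
```

Here `tokens` corresponds to the token files, `ids` to the token-id files, and `itos`/`stoi` to the vocabulary files that a language model needs in order to map ids back to words.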

Model

__main__.py fine-tunes the pre-trained language model by fastai, then trains and evaluates the model.
