bert text-classification human-va fine-tuning roberta xlnet-pytorch multilabel-classification natural-language-processing nlp

Human Values Detection Behind Arguments

In this work, we tackle Task 4 of the Touché competition: Human Value Detection 2023.

Useful resources

Model	Notebook
SVM
BERT, RoBERTa, DistilBERT
XLNet

Task

The task is a multilabel text classification problem: given a textual argument and a human value category, classify whether or not the argument draws on that category.
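As a rough illustration of the multilabel setup, each argument can be paired with a binary vector holding one entry per value category. The category names and the helper `to_multi_hot` below are illustrative assumptions, not taken from the notebooks.

```python
# Hypothetical subset of value categories; the task defines its own full list.
CATEGORIES = ["Self-direction: thought", "Security: personal", "Achievement"]


def to_multi_hot(active, categories=CATEGORIES):
    """Return a 0/1 vector with a 1 for every category the argument draws on."""
    active = set(active)
    unknown = active - set(categories)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return [1 if name in active else 0 for name in categories]


# An argument may draw on several categories at once, or on none.
example = to_multi_hot(["Achievement", "Self-direction: thought"])
```

Because every entry is decided independently, an argument with no active category is simply the all-zero vector.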

Arguments are given as a triplet:

Conclusion: Conclusion text of the argument
Stance: Stance of the Premise towards the Conclusion; one of "in favor of", "against"
Premise: Premise text of the argument
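One possible way to feed the triplet to a text classifier is to join its three parts into a single string. The sketch below is only an assumption about the input format: the `Argument` class, the `to_model_input` helper, the field order, and the `[SEP]` placeholder are all hypothetical, and the right separator depends on the tokenizer in use.

```python
from dataclasses import dataclass

VALID_STANCES = {"in favor of", "against"}


@dataclass
class Argument:
    conclusion: str
    stance: str  # one of VALID_STANCES
    premise: str


def to_model_input(arg: Argument, sep: str = " [SEP] ") -> str:
    """Join premise, stance and conclusion into one string (assumed order)."""
    if arg.stance not in VALID_STANCES:
        raise ValueError(f"unexpected stance: {arg.stance!r}")
    return sep.join([arg.premise, arg.stance, arg.conclusion])


# Made-up argument, used only to show the resulting string.
text = to_model_input(
    Argument(
        conclusion="We should do X",
        stance="in favor of",
        premise="X helps people",
    )
)
```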

Data

We use the data available on Zenodo, restricted to the following files: arguments-training.tsv, arguments-validation.tsv, labels-training.tsv, labels-validation.tsv.

NOTE: Since the test data is provided without labels, we did not consider it in our analysis. The performance of the tested models has therefore been evaluated only on the validation data.
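A minimal sketch of how an arguments file and its labels file could be read and joined, using only the standard library. The column names (`Argument ID`, `Value A`, `Value B`) and the in-memory sample rows are assumptions made for illustration; check them against the actual TSV headers before reuse.

```python
import csv
import io


def read_tsv(handle):
    """Parse a tab-separated file with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def join_on_id(arguments, labels, key="Argument ID"):
    """Attach each argument's label columns, matching rows on the ID column."""
    labels_by_id = {row[key]: row for row in labels}
    merged = []
    for arg in arguments:
        label_row = labels_by_id[arg[key]]
        targets = {k: int(v) for k, v in label_row.items() if k != key}
        merged.append({**arg, "labels": targets})
    return merged


# Tiny in-memory stand-ins for an arguments file and a labels file.
arguments_tsv = (
    "Argument ID\tConclusion\tStance\tPremise\n"
    "A1\tWe should do X\tin favor of\tX helps people\n"
)
labels_tsv = "Argument ID\tValue A\tValue B\nA1\t1\t0\n"

rows = join_on_id(
    read_tsv(io.StringIO(arguments_tsv)),
    read_tsv(io.StringIO(labels_tsv)),
)
```

With the real files, the `io.StringIO` objects would be replaced by file handles, for example `open("arguments-training.tsv", newline="", encoding="utf-8")`.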

Tested models

SVM
BERT-base
BERT-large
RoBERTa-base
RoBERTa-large
DistilBERT
XLNet-base
XLNet-large
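For the transformer models, a common way to obtain multilabel predictions is to apply an independent sigmoid to each output logit and threshold the result. The sketch below assumes that setup and a 0.5 threshold; neither is confirmed by this README, and `predict_labels` is a hypothetical helper.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_labels(logits, threshold=0.5):
    """Turn one logit per category into independent 0/1 decisions."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


# One made-up logit per category for a single argument.
preds = predict_labels([2.0, -1.0, 0.0])
```

Unlike a softmax, this treats every category separately, so any number of categories can be active for the same argument.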

Results

Compared to the original paper, the macro-averaged F1 Score has been improved:

by more than 20% for the SVM model
by up to 47% for the transformers

MODEL	SVM	BERT-base	BERT-large	DistilBERT	RoBERTa-base	RoBERTa-large	XLNet-base	XLNet-large
F1 avg macro (validation)	0.37	0.42	0.44	0.43	0.47	0.50	0.44	0.50
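For reference, the macro-averaged F1 score computes one F1 value per category and then takes their unweighted mean. The sketch below is a plain-Python version of that definition, assuming an F1 of 0 for a category with no true or predicted positives; the function names and the toy data are illustrative only.

```python
def f1_for_label(y_true, y_pred):
    """F1 for a single category, given parallel lists of 0/1 values."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Assumed convention: F1 is 0 when the category never occurs.
    return 2 * tp / denom if denom else 0.0


def macro_f1(true_rows, pred_rows):
    """Unweighted mean of the per-category F1 scores."""
    n_labels = len(true_rows[0])
    scores = []
    for j in range(n_labels):
        col_true = [row[j] for row in true_rows]
        col_pred = [row[j] for row in pred_rows]
        scores.append(f1_for_label(col_true, col_pred))
    return sum(scores) / n_labels


# Three made-up arguments, two categories each.
y_true = [[1, 0], [1, 1], [0, 1]]
y_pred = [[1, 1], [0, 1], [0, 0]]
score = macro_f1(y_true, y_pred)
```

Since every category carries the same weight, rare categories influence the macro average as much as frequent ones.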

About

This GitHub repository presents our solution to Touché 2023 Task 4: Human Value Detection, a multilabel text classification task. We fine-tuned transformer architectures like BERT, RoBERTa, and XLNet to classify whether or not a given argument draws on a human value category.

MIT License
