Code repo of the final LFD paper
conda create -n lfdenvg8 python=3.9
conda activate lfdenvg8
pip install -r requirements.txt
The best-performing T5-explanation model (prompt-3) can be downloaded from here. Follow the instructions below to run the saved model for evaluation and prediction.
- Discriminative models are in the "baselines/" folder.
- Generative models (our approach) are in the "t5-scripts/" folder.
├── data
├── baselines
│ ├── neural
│ │ ├── **bert/**
│ │ ├── **lstm/**
│ │ └── utils.py
│ └── **svm**
└── t5-scripts
    ├── common_utils.py
    ├── **explain-w-t5/**
    └── **t5-wo-explanation/**
├── lexicon_words
│ └── final_offensive_lexicon.txt
├── LICENSE
├── README.md
├── requirements.txt
- To run the baselines, navigate to the "baselines/" folder. The README.md inside it explains how to train, evaluate, and predict.
- To run the model with explanations, navigate to its folder:
cd t5-scripts/explain-w-t5/
- Selecting a template: This folder contains multiple template files, templatefile_{1,2,3,4,6}.py. Decide which template you want to try. Currently, we use template 3, which gave us good results (see the table at the end). To try another template, update specific_utils.py so that TemplateHandler is imported from the template file you want. For example, for template-1, change the following line in specific_utils.py from
from templatefile_3 import TemplateHandler
to
from templatefile_1 import TemplateHandler
- Run the training script:
python train.py --train_file ../../data/train.tsv --dev_file ../../data/dev.tsv --learning_rate 1e-4 --batch_size 8 --num_epochs 5 --max_seq_len 150 --langmodel_name t5-base --offensive_lexicon ../lexicon_words/final_offensive_lexicon.txt --ckpt_folder ./t5explain-files/ --seed 1234 --device cpu
- The best model will be stored in t5explain-files/best-model.ckpt
- Evaluating the model on the test file:
python evaluate.py --test_file ../../data/test.tsv --best_modelname t5explain-files/bestmodel.ckpt --offensive_lexicon ../lexicon_words/final_offensive_lexicon.txt --batch_size 16 --device cpu
- Getting the predictions into a file
python predict.py --test_file ../../data/test.tsv --best_modelname t5explain-files/bestmodel.ckpt --offensive_lexicon ../lexicon_words/final_offensive_lexicon.txt --batch_size 16 --device cpu --output_predfile preds.txt
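To make the template step above more concrete, here is a minimal sketch of how an explanation-style target string could be assembled from a label and the lexicon words found in a sentence. It is not the repository's TemplateHandler: the function names `find_lexicon_words` and `build_target`, the tokenization rule, and the toy lexicon are assumptions made for illustration. Only the two prompt strings are taken from this README (the prompt-3 wording in the results table).

```python
import re

# Wording copied from the prompt-3 row of the results table in this README.
OFFENSIVE_PROMPT = "which words made us decide this is offensive, you ask? here you go:"
NOT_OFFENSIVE_PROMPT = "which words made us decide this is not offensive, you ask?"


def find_lexicon_words(text, lexicon):
    """Return lexicon words occurring in `text`, in order of first appearance.

    Hypothetical helper: the real scripts may tokenize and match differently.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    seen = []
    for token in tokens:
        if token in lexicon and token not in seen:
            seen.append(token)
    return seen


def build_target(text, is_offensive, lexicon):
    """Build an explanation-style target string (hypothetical helper)."""
    if not is_offensive:
        return NOT_OFFENSIVE_PROMPT
    words = find_lexicon_words(text, lexicon)
    # If no lexicon word matches, fall back to the bare prompt.
    return f"{OFFENSIVE_PROMPT} {', '.join(words)}".strip()


# Toy lexicon for illustration only; the real list is final_offensive_lexicon.txt.
toy_lexicon = {"idiot", "stupid"}
print(build_target("You are a stupid idiot, really stupid!", True, toy_lexicon))
print(build_target("Have a nice day", False, toy_lexicon))
```

Swapping in another template would, under this sketch, only mean changing the two prompt constants, which mirrors the idea of switching the imported template file.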
Prompt | Macro-F1 |
---|---|
prompt-1: "the model observed classified offensive since the following words showed up" / "the model observed classified not offensive" | 78.93 ± 0.9 |
prompt-2: "we had several words that rendered this offensive they were" / "we had no words that rendered this offensive, they are non-existent!" | 79.64 ± 0.45 |
prompt-3: "which words made us decide this is offensive, you ask? here you go:" / "which words made us decide this is not offensive, you ask?" | 78.53 ± 0.465 |
prompt-4: "the provided sentence may be interpreted as offensive by some users as certain offensive words occur such as" / "the provided sentence may not be found offensive by most users." | 79.51 ± 0.6 |
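For readers who want to compute comparable numbers from their own predictions, the following is a minimal sketch of macro-F1 and of a mean and spread over several runs. It assumes that the spread reported in the table is a standard deviation across runs, which this README does not state explicitly. The label strings `"OFF"` and `"NOT"` and the function names are hypothetical and are not taken from the repository's scripts.

```python
from statistics import mean, stdev


def f1_for_class(gold, pred, cls):
    """F1 score treating `cls` as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == cls and p == cls)
    fp = sum(1 for g, p in zip(gold, pred) if g != cls and p == cls)
    fn = sum(1 for g, p in zip(gold, pred) if g == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(gold) | set(pred))
    return mean(f1_for_class(gold, pred, c) for c in classes)


def summarize(scores):
    """Mean and sample standard deviation over per-run scores."""
    return mean(scores), stdev(scores)


# Hypothetical labels, for illustration only.
gold = ["OFF", "NOT", "OFF", "NOT"]
pred = ["OFF", "NOT", "NOT", "NOT"]
print(f"macro-F1: {macro_f1(gold, pred):.4f}")

# Hypothetical per-run scores, e.g. one per random seed.
run_mean, run_std = summarize([0.78, 0.80])
print(f"{run_mean:.4f} +/- {run_std:.4f}")
```

Macro-F1 weights both classes equally, so it is less dominated by the majority class than accuracy when offensive examples are rare.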