This repository contains the code and data for the paper "Testing the Generalization of Neural Language Models for COVID-19 Misinformation Detection".
We provide two ways to set up this project.
Getting Started
First, install conda.
Next, create the conda environment from the requirements file with "conda env create -f tfrs_env.yml".
Training and testing
The following files and folders contain the code to reproduce the experiments from our paper:
- bert_pytorch.py - Code for using BERT-based embeddings for downstream tasks.
- finetuning.py - Code for fine-tuning all models on all datasets.
- intermediate_training/ - Code for training models on CORD-19 with a pre-training objective (e.g., Masked Language Modeling).
- supplemental_code/ - Code for creating the plots for the paper and for the significance analysis.
- baselines/ - Code for testing bi-LSTM baselines against transformer language models.
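
For readers unfamiliar with the Masked Language Modeling objective mentioned for intermediate_training/, the following is a minimal, standard-library-only sketch of the token corruption step as it is commonly described for BERT-style models (select about 15% of positions, then replace 80% of those with a mask token, 10% with a random token, and leave 10% unchanged). It is not taken from this repository: the function name `mask_tokens`, the `MASK_TOKEN` constant, and the toy vocabulary are assumptions made for illustration, and the actual training code may rely on a library implementation instead.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed placeholder, as in BERT-style vocabularies


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Corrupt a token sequence for a masked language modeling step.

    Returns (corrupted, labels). labels[i] is the original token at every
    position the model should predict, and None at all other positions.
    This is a hypothetical helper, not part of the repository.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK_TOKEN  # 80%: replace with the mask token
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return corrupted, labels


if __name__ == "__main__":
    sentence = "covid vaccines reduce the risk of severe illness".split()
    toy_vocab = ["virus", "dose", "mask", "study"]  # illustrative only
    corrupted, labels = mask_tokens(sentence, toy_vocab, seed=42)
    print(corrupted)
    print(labels)
```

The model is then trained to recover the original token at every position where `labels` is not None. Positions that are left unchanged but still labeled keep the model from assuming that only mask tokens need to be predicted.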
@inproceedings{Wahle2022a,
title = {{Testing} the {Generalization} of {Neural} {Language} {Models} for {COVID}-19 {Misinformation} {Detection}},
author = {Wahle, Jan Philip and Ashok, Nischal and Ruas, Terry and Meuschke, Norman and Ghosal, Tirthankar and Gipp, Bela},
year = 2022,
month = {February},
booktitle = {Proceedings of the iConference},
location = {Virtual Event},
}