Covid-19 Misinformation Detection

This repository contains the code and data for the paper "Testing the Generalization of Neural Language Models for COVID-19 Misinformation Detection".

How to use

We provide two ways to set up this project.

Getting Started

First, install conda.

Next, create the conda environment from the requirements file: conda env create -f tfrs_env.yml

Training and testing

The following files and folders contain the code to reproduce the experiments from our paper:

bert_pytorch.py - Code for using BERT-based embeddings in a downstream task.
finetuning.py - Code for fine-tuning all models on all datasets.
intermediate_training/ - Code for training models on CORD-19 with a pre-training objective (e.g., Masked Language Modeling).
supplemental_code/ - Code for creating the plots in the paper and for the significance analysis.
baselines/ - Code for testing bi-LSTM baselines against transformer language models.
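
For readers unfamiliar with the Masked Language Modeling objective mentioned above, the following is a minimal, self-contained sketch of how training examples are typically built. It is not taken from intermediate_training/ and the repository's code may differ. It assumes the BERT-style recipe (select about 15% of positions, then replace 80% of those with a mask token, 10% with a random token, and leave 10% unchanged). The names mask_tokens, MASK_TOKEN, and the toy vocabulary are hypothetical and used only for illustration.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed placeholder token, as in BERT-style vocabularies


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels) for one masked-language-modeling example.

    labels[i] holds the original token at positions the model must predict,
    and None everywhere else.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK_TOKEN         # 80%: replace with the mask token
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return masked, labels


if __name__ == "__main__":
    # Toy sentence and vocabulary, for illustration only.
    sentence = "masks reduce the spread of respiratory viruses".split()
    vocab = sorted(set(sentence))
    masked, labels = mask_tokens(sentence, vocab, mask_prob=0.3, seed=0)
    print(masked)
    print(labels)
```

In a real pipeline the tokens would be subword IDs from a tokenizer and the loss would be computed only at positions where a label is set; this sketch keeps plain strings so that it runs without any dependencies.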

How to cite

@inproceedings{Wahle2022a,
  title        = {{Testing} the {Generalization} of {Neural} {Language} {Models} for {COVID}-19 {Misinformation} {Detection}},
  author       = {Wahle, Jan Philip and Ashok, Nischal and Ruas, Terry and Meuschke, Norman and Ghosal, Tirthankar and Gipp, Bela},
  year         = 2022,
  month        = {February},
  booktitle    = {Proceedings of the iConference},
  location     = {Virtual Event},
}

About

The official implementation of the iConference 2022 paper "Testing the Generalization of Neural Language Models for COVID-19 Misinformation Detection"

https://link.springer.com/chapter/10.1007/978-3-030-96957-8_33
