Right and left, partisanship predicts vulnerability to misinformation

Introduction
Datasets
Environment
Workflow
Contact

Introduction

In this repository, you can find code and instructions for reproducing the plots from "Right and left, partisanship predicts vulnerability to misinformation" by Dimitar Nikolov, Alessandro Flammini, and Filippo Menczer.

To start, clone the repo:

$ git clone https://github.com/dimitargnikolov/twitter-misinformation.git

Run all subsequent commands from the root of the cloned repo.

Datasets

There are three datasets you need to obtain. Before you begin, create a data directory at the root of the repo.

Link Sharing on Twitter

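This dataset contains a set of link sharing actions that occurred on Twitter during the month of June 2017. The dataset is available on the Harvard Dataverse. Place the downloaded file in the data directory; the listing under "`data` Directory" expects it as domain-shares.data.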

Political Valence

This is a dataset from Facebook, which gives political valence scores to several popular news sites. You can request access to the dataset from Dataverse. Once you have access, put the top500.csv file into the data directory.
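Before running the workflow, it can help to confirm that top500.csv was saved correctly. The sketch below is not part of the repository: the function name peek_csv is hypothetical, and it assumes the file is UTF-8 encoded with a header in its first row. It makes no assumption about column names and simply returns whatever the file contains.

```python
import csv
from itertools import islice
from pathlib import Path


def peek_csv(path, n_rows=5):
    """Return (header, first n_rows data rows) without assuming any column names."""
    # Assumption: UTF-8 text with the header on the first line.
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        rows = list(islice(reader, n_rows))
    return header, rows


if __name__ == "__main__":
    path = Path("data/top500.csv")
    if path.is_file():
        header, rows = peek_csv(path)
        print(header)
        for row in rows:
            print(row)
    else:
        print(f"{path} not found; request access and place the file there first.")
```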

Misinformation

This is a dataset of manually curated sources of misinformation available at OpenSources.co. Clone it from GitHub into your data directory:

$ git clone https://github.com/BigMcLargeHuge/opensources.git data/opensources

`data` Directory

Once you obtain all data as described above, your data directory should look like this:

data
├── domain-shares.data
├── opensources
│   ├── CONTRIBUTING.md
│   ├── LICENSE
│   ├── README.md
│   ├── badges.txt
│   ├── releasenotes.txt
│   └── sources
│       ├── sources.csv
│       └── sources.json
└── top500.csv
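
The layout above can be checked with a short script before starting the workflow. This is a minimal sketch, not something shipped with the repository: the function name missing_data_files is hypothetical, and the list of expected paths is copied from the listing above (only the data files, not the opensources documentation files).

```python
from pathlib import Path

# Paths taken from the directory listing above, relative to the data directory.
EXPECTED = [
    "domain-shares.data",
    "top500.csv",
    "opensources/sources/sources.csv",
    "opensources/sources/sources.json",
]


def missing_data_files(data_dir="data"):
    """Return the expected paths that are not present under data_dir."""
    root = Path(data_dir)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing from data/:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected data files are in place.")
```

Run it from the root of the repo so that the relative path data resolves correctly.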

Environment

Make sure you have Python 3 installed on your system. Then, set up a virtualenv with the required modules at the root of the cloned repository:

$ virtualenv -p python3 VENV
$ source VENV/bin/activate
$ pip install -r requirements.txt

From now on, any time you want to run the analysis, activate your virtual environment with:

$ source VENV/bin/activate
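
If you are unsure whether the environment is active, the following sketch reports the interpreter version and whether Python is running inside a virtual environment. It is an illustration rather than part of the repository; the function name in_virtualenv is hypothetical, and the check relies only on attributes of the standard sys module.

```python
import sys


def in_virtualenv():
    """Return True if the interpreter appears to run inside a virtual environment."""
    # venv and recent virtualenv releases make sys.prefix differ from sys.base_prefix;
    # older virtualenv releases set sys.real_prefix instead.
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    print("Inside a virtual environment:", in_virtualenv())
```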

Workflow

The replication code is contained in the .py files in the scripts directory. You can automate their execution with the provided snakemake workflow:

$ cd workflow
$ snakemake -p

The execution will display the actual shell commands being executed, so you can run them individually if you want. You can inspect the workflow/Snakefile file to see how the inputs and outputs for each script are specified. In addition, you can execute each script with

$ python <script_name.py> --help

to learn about what it does.

At the end of the execution, the generated plots will be in the data directory.

To regenerate the plots from scratch, run the following from the workflow directory:

$ snakemake clean
$ snakemake -p

Contact

If you have any questions about running this code or obtaining the data, please open an issue in this repository and we will get back to you as soon as possible.
