Neutralizing Biased Text

This repo contains code for the paper, "Automatically Neutralizing Subjective Bias in Text".

Concretely this means algorithms for

Identifying biased words in sentences.
Neutralizing bias in sentences.

This repo was tested with python 3.7.7.

Quickstart

These commands will download data and a pretrained model before running inference.

$ cd src/
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
$ python
>> import nltk; nltk.download("punkt")
$ sh download_data_ckpt_and_run_inference.sh

You can also run sh integration_test.sh to further verify that everything is installed correctly and working as it should be.

Data

Click this link to download (100MB, expands to 500MB).

If that link is broken, try this mirror: download link

Pretrained model

Click this link to download a (Modular) model checkpoint. We used this command to train it.

Overview

harvest/: Code for making the dataset. It works by crawling and filtering Wikipedia for bias-driven edits.

src/: Code for training models and using trained models to run inference. The models implemented here are referred to as MODULAR and CONCURRENT in the paper.

Usage

Please see src/README.md for bias neutralization directions.

See harvest/README.md for making a new dataset (as opposed to downloading the one available above).

About

Code and data for the paper, "Automatically Neutralizing Subjective Bias in Text"

MIT License

Languages

Language:Python 84.0%Language:Jupyter Notebook 14.8%Language:Shell 1.2%