Debiasing Multiclass Word Embeddings

This repository contains the code that was written in support of the NAACL 2019 paper Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings.

The repository has three main components:

  1. Performing debiasing and MAC score calculations (./Debiasing/debias.py)
  2. Cluster bias analysis (./Debiasing/neighborAnalysis.py)
  3. Downstream evaluation (./Downstream/BiasEvalPipelineRunner.ipynb)

Debiasing & MAC

Running these scripts requires several data files. If you would like to replicate our results from scratch, you must download the following:

  1. The Reddit.l2 Corpus

If you would like to replicate our results without training your own Word2Vec embeddings, we provide pretrained Word2Vec embeddings. Note that the results in the paper were computed on w2v_0.

  1. Pretrained Baseline Word2Vecs w2v_0, w2v_1, w2v_2, w2v_3, w2v_4
  2. Word2Vecs which have been debiased using hard debiasing for gender, race, and religion, all based on w2v_0.
  3. Word2Vecs which have been debiased using soft debiasing for gender, race, and religion, all based on w2v_0.

If you are replicating our results from scratch, you can train your Word2Vecs using ./Debiasing/word2vec.py. Note that this script will generate Word2Vec embeddings for every corpus file in the folder ./Debiasing/reddit.l2/*.
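For orientation, a minimal sketch of that training step using the gensim 3.5 API pinned under Requirements is shown below; the corpus filename and hyperparameters are illustrative assumptions rather than the script's exact settings (see word2vec.py for the real configuration).

    # Illustrative sketch only -- word2vec.py is the authoritative training script.
    from gensim.models import Word2Vec
    from gensim.models.word2vec import LineSentence

    # Hypothetical corpus file; word2vec.py iterates over ./Debiasing/reddit.l2/*
    sentences = LineSentence('reddit.l2/reddit.US.txt.tok.clean')

    # Hyperparameters are assumptions (gensim 3.5 uses `size`, not `vector_size`)
    model = Word2Vec(sentences, size=300, window=5, min_count=5, workers=4)
    model.wv.save_word2vec_format('data/w2vs/reddit.US.txt.tok.clean.cleanedforw2v.w2v')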

Once you have trained your word embeddings, you can evaluate them using debias.py. Running debias.py requires the following command line arguments:

  • embeddingPath : The path to the word2vec embeddings
  • vocabPath : The path to the social stereotype vocabulary
  • mode : The type of social bias data that should be used when performing analogy generation and debiasing
  • -hard : If this flag is set, hard debiasing will be performed
  • -soft : If this flag is set, soft debiasing will be performed
  • -k : An integer denoting how many principal components are used to define the bias subspace (defaults to 1)
  • -w : If this flag is set, the output of the analogy tasks, the debiased Word2Vecs, and the MAC statistics will be written to disk in a folder named "./output"
  • -v : If this flag is set, the debias script will execute in verbose mode
  • -analogies : If this flag is set, analogies will be generated
  • -printLimit : An integer defining how many analogies of each type are printed when executing in verbose mode
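Since -k controls the dimensionality of the bias subspace, the sketch below shows how such a subspace is commonly identified, following the PCA construction of Bolukbasi et al. that the paper generalizes to multiclass defining sets; the function and variable names here are ours, and debias.py remains the authoritative implementation.

    import numpy as np

    def bias_subspace(kv, defining_sets, k=1):
        """Top-k principal components of mean-centered defining-set vectors."""
        diffs = []
        for words in defining_sets:  # e.g. [["he", "she"], ["man", "woman"], ...]
            vecs = np.stack([kv[w] for w in words if w in kv])
            diffs.append(vecs - vecs.mean(axis=0))  # center each set at its mean
        D = np.concatenate(diffs, axis=0)
        _, _, Vt = np.linalg.svd(D, full_matrices=False)  # PCA via SVD
        return Vt[:k]  # rows span the bias subspace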

Our vocabularies for bias detection and removal can be found under ./Debiasing/data/vocab.
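As a reference for the MAC numbers debias.py reports: MAC (Mean Average Cosine distance) averages, over every target word t and every attribute set A_j, the mean cosine distance between t and the words in A_j. The sketch below is our hedged reading of that definition; the function name, OOV handling, and data layout are assumptions.

    # Hedged sketch of MAC; see debias.py for the implementation actually used.
    import numpy as np
    from scipy.spatial.distance import cosine  # cosine distance = 1 - cosine similarity

    def mac_score(kv, targets, attribute_sets):
        """Mean of the average cosine distance from each target to each attribute set."""
        means = []
        for t in targets:
            if t not in kv:
                continue  # skipping OOV targets is an assumption
            for attrs in attribute_sets:
                dists = [cosine(kv[t], kv[a]) for a in attrs if a in kv]
                if dists:
                    means.append(np.mean(dists))
        return float(np.mean(means))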

Example commands are included below for reference.

This command performs hard religious debiasing based on attributes in the passed vocab file. Verbose mode is used and the first 2 PCA components are used for debiasing.

python debias.py 'data/w2vs/reddit.US.txt.tok.clean.cleanedforw2v.w2v' 'data/vocab/religion_attributes_optm.json' attribute -v -hard -k 2

This command performs hard & soft gender debiasing based on roles in the passed vocab file. Verbose mode is used and 500 analogies are printed for each embedding space (biased / hard debiased / soft debiased).

python debias.py 'data/w2vs/reddit.US.txt.tok.clean.cleanedforw2v.w2v' 'data/vocab/gender_attributes_optm.json' role -v -hard -soft -printLimit 500 -analogies
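When -w is set, the debiased embeddings land in ./output and can be reloaded like any word2vec file. A small usage sketch (the filename is taken from the cluster-analysis example in the next section, and we assume the plain-text word2vec format):

    from gensim.models import KeyedVectors

    # Filename pattern copied from the neighborAnalysis.py example below
    kv = KeyedVectors.load_word2vec_format(
        'output/data_race_attributes_optm_json_role_hardDebiasedEmbeddingsOut.w2v')
    print(kv.most_similar('police', topn=5))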

Cluster Bias Analysis

To run the cluster bias analysis (based on Gonen and Goldberg (2019)), run the following:

python neighborAnalysis.py <biased embeddings> <debiased embeddings> <debiasing info> <targets>

For example:

python neighborAnalysis.py 'data/w2vs/reddit.US.txt.tok.clean.cleanedforw2v.w2v' 'output/data_race_attributes_optm_json_role_hardDebiasedEmbeddingsOut.w2v' 'data/vocab/race_attributes_optm.json' 'professions.json' --multi

Arguments/flags:

  • targets: a JSON file containing a list of target words to evaluate bias on, such as Bolukbasi's professions.json
  • --bias_specific: a JSON file containing a list of words that inherently contain bias (for example, he and she for gender) and should be ignored; for example, data/vocab/religion_specific.json
  • --multi: Set for multiclass debiasing
  • -v: Verbose mode
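For intuition about what neighborAnalysis.py measures: Gonen and Goldberg's cluster test embeds the words associated with each social group, clusters them with KMeans, and checks how well the clusters align with the original group labels; strong alignment even after debiasing indicates residual bias. Below is a minimal two-group sketch under those assumptions (the helper name and details are ours, and the script itself also handles the multiclass case via --multi).

    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_alignment(kv, group_a, group_b):
        """Accuracy of 2-means clustering against the group labels (label-invariant)."""
        words = [w for w in group_a + group_b if w in kv]
        labels = np.array([0 if w in set(group_a) else 1 for w in words])
        X = np.stack([kv[w] for w in words])
        pred = KMeans(n_clusters=2, random_state=0).fit_predict(X)
        acc = (pred == labels).mean()
        return max(acc, 1.0 - acc)  # cluster ids are arbitrary; take the better mapping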

Downstream Evaluation

In order to replicate our results, you must use the embeddings that were generated in the Debiasing section (or you can simply download our pretrained and pre-debiased embeddings). These embeddings should be stored in ./Downstream/data/wvs/.

Additionally, you must download the CoNLL-2003 dataset. The data should be segmented into train, test, and validation files stored in ./Downstream/data/conll2003/.

After these files have been placed in the appropriate locations, you can replicate our results by running the Jupyter notebook ./Downstream/BiasEvalPipelineRunner.ipynb.

Requirements

The following Python packages are required (Python 2):

  • numpy 1.14.5
  • scipy 1.1.0
  • gensim 3.5.0
  • sklearn 0.19.2
  • pytorch 0.4.0
  • matplotlib 2.2.3
  • jupyter 1.0.0

Citing

If you found this repository or our paper helpful, please consider citing us with the following BibTeX:

@article{manzini2019black,
  title={Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings},
  author={Manzini, Thomas and Lim, Yao Chong and Tsvetkov, Yulia and Black, Alan W},
  journal={arXiv preprint arXiv:1904.04047},
  year={2019}
}
