Fairness Project using GANs to generate fair data representations applied to the Brazilian context
-
Install
Create a virtual environment
python3 -m venv venv
-
Activate Virtual Environment
source venv/bin/activate
-
Install requirements
pip install -r requirements.txt
-
Create an Anaconda environment from
environment.yml
conda env create -f environment.yml
-
Activate the new environment
conda activate pytorch
-
Run main.py
chmod +x main.py
./main.py --help
In order to get a 'quantitative' measure of how fair our classifier is, we take inspiration from the U.S. Equal Employment Opportunity Commission (EEOC). They use the so-called 80% rule to quantify the disparate impact on a group of people of a protected characteristic. Zafar et al. show in their paper "Fairness Constraints: Mechanisms for Fair Classification" how a more generic version of this rule, called the p%-rule, can be used to quantify fairness of a classifier. This rule is defined as follows:
A classifier that makes a binary class prediction ŷ ∈ {0, 1} given a binary sensitive attribute z ∈ {0, 1} satisfies the p%-rule if the following inequality holds:

min( P(ŷ=1 | z=1) / P(ŷ=1 | z=0), P(ŷ=1 | z=0) / P(ŷ=1 | z=1) ) ≥ p/100
The rule states that the ratio between the probability of a positive outcome given the sensitive attribute being true and the same probability given the sensitive attribute being false is no less than p:100. So, when a classifier is completely fair it will satisfy a 100%-rule. In contrast, when it is completely unfair it satisfies a 0%-rule.
In determining the fairness of our classifier we will follow the EEOC and say that a model is fair when it satisfies at least an 80%-rule. So, let's compute the p%-rules for the classifier and put a number on its fairness. Note that we will threshold our classifier at 0.5 to make its predictions binary.
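The p%-rule above can be computed directly from the model's scores. The sketch below assumes binary predictions are obtained by thresholding at 0.5, as described; the function name `p_rule` and its signature are illustrative, not part of the project's API.

```python
import numpy as np

def p_rule(y_pred, z, threshold=0.5):
    """p%-rule of binary predictions y_pred w.r.t. a binary sensitive attribute z."""
    y_bin = np.asarray(y_pred) > threshold
    z = np.asarray(z).astype(bool)
    # P(y_hat = 1 | z = 1) and P(y_hat = 1 | z = 0)
    rate_z1 = y_bin[z].mean()
    rate_z0 = y_bin[~z].mean()
    # ratio of the two conditional probabilities, expressed in percent
    return 100 * min(rate_z1 / rate_z0, rate_z0 / rate_z1)
```

A value of 100 means the positive-outcome rates are identical across the two groups; following the EEOC guideline, a value of at least 80 counts as fair.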
The Fairness-GAN is composed of 4 neural networks:
- Encoder: maps the input features x to a latent representation z
- Decoder: reconstructs x from z
- Classifier: tries to predict the label y by mapping z to a prediction ŷ
- Discriminator: tries to predict the sensitive attribute s from z
- Loss function: combines the decoder's reconstruction error, the classifier's prediction error, and an adversarial term from the discriminator
- λ: regularizer that forces the classifier towards fairer predictions while sacrificing prediction accuracy
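The four networks and the combined loss can be sketched in PyTorch as follows. This is a minimal illustration, not the project's actual architecture: the layer sizes, the single-layer networks, and the exact form of the loss (reconstruction + classification − λ · discriminator) are assumptions based on the description above.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; the real ones depend on the dataset.
n_features, n_latent = 30, 8

encoder = nn.Sequential(nn.Linear(n_features, n_latent), nn.ReLU())
decoder = nn.Sequential(nn.Linear(n_latent, n_features))
classifier = nn.Sequential(nn.Linear(n_latent, 1), nn.Sigmoid())
discriminator = nn.Sequential(nn.Linear(n_latent, 1), nn.Sigmoid())

bce = nn.BCELoss()
mse = nn.MSELoss()
lam = 1.0  # lambda: trades prediction accuracy for fairness

def generator_loss(x, y, s):
    """Loss for encoder/decoder/classifier: reconstruct x, predict y,
    and hide the sensitive attribute s from the discriminator."""
    z = encoder(x)
    loss_dec = mse(decoder(z), x)                          # reconstruction error
    loss_clf = bce(classifier(z).squeeze(1), y)            # prediction error
    loss_adv = bce(discriminator(z).squeeze(1), s)         # adversary's error
    # subtracting the adversary's loss pushes z to reveal nothing about s
    return loss_dec + loss_clf - lam * loss_adv
```

The discriminator itself is trained in alternation, minimizing its own BCE on predicting s from z, which is the usual adversarial setup.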
Output plots: Decoder Error, ROC Curve (Classifier), ROC Curve (Discriminator)