Fairness Project using GANs to generate fair data representations applied to the Brazilian context
-
Install
Create a virtual environment
python3 -m venv venv
-
Activate Virtual Environment
source venv/bin/activate
-
Install requirements
pip install -r requirements.txt
-
Create an Anaconda environment from
environment.yml
conda env create -f environment.yml
-
Activate the new environment
conda activate pytorch
-
Run main.py
chmod +x main.py
./main.py --help
In order to get a 'quantitative' measure of how fair our classifier is, we take inspiration from the U.S. Equal Employment Opportunity Commission (EEOC). They use the so-called 80% rule to quantify the disparate impact on a group of people of a protected characteristic. Zafar et al. show in their paper "Fairness Constraints: Mechanisms for Fair Classification" how a more generic version of this rule, called the p%-rule, can be used to quantify fairness of a classifier. This rule is defined as follows:
A classifier that makes a binary class prediction ŷ ∈ {0, 1} given a binary sensitive attribute z ∈ {0, 1} satisfies the p%-rule if the following inequality holds:

min( P(ŷ=1 | z=1) / P(ŷ=1 | z=0), P(ŷ=1 | z=0) / P(ŷ=1 | z=1) ) ≥ p/100
The rule states that the ratio between the probability of a positive outcome given the sensitive attribute being true and the same probability given the sensitive attribute being false is no less than p:100. So, when a classifier is completely fair it will satisfy a 100%-rule. In contrast, when it is completely unfair it satisfies a 0%-rule.
In determining the fairness of our classifier we will follow the EEOC and say that a model is fair when it satisfies at least an 80%-rule. So, let's compute the p%-rules for the classifier and put a number on its fairness. Note that we will threshold our classifier at 0.5 to make its predictions binary.
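The p%-rule above can be computed directly from the model's scores. The sketch below assumes binary predictions are obtained by thresholding at 0.5, as described; the function name `p_rule` and its signature are illustrative, not part of the project's API.

```python
import numpy as np

def p_rule(y_pred, z, threshold=0.5):
    """p%-rule of binary predictions y_pred w.r.t. a binary sensitive attribute z."""
    y_bin = np.asarray(y_pred) > threshold
    z = np.asarray(z).astype(bool)
    # P(y_hat = 1 | z = 1) and P(y_hat = 1 | z = 0)
    rate_z1 = y_bin[z].mean()
    rate_z0 = y_bin[~z].mean()
    # ratio of the two conditional probabilities, expressed in percent
    return 100 * min(rate_z1 / rate_z0, rate_z0 / rate_z1)
```

A value of 100 means the positive-outcome rates are identical across the two groups; following the EEOC guideline, a value of at least 80 counts as fair.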
The Fairness-GAN is composed of 4 neural networks:
- Encoder: maps the input features x to a latent representation z
- Decoder: reconstructs x from z
- Classifier: tries to predict the label y by mapping z to a prediction ŷ
- Discriminator: tries to predict the sensitive attribute s from z
- Loss function: combines the decoder's reconstruction error, the classifier's prediction error, and an adversarial term from the discriminator
- λ: regularizer that forces the classifier towards fairer predictions while sacrificing prediction accuracy
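The four networks and the combined loss can be sketched in PyTorch as follows. This is a minimal illustration, not the project's actual architecture: the layer sizes, the single-layer networks, and the exact form of the loss (reconstruction + classification − λ · discriminator) are assumptions based on the description above.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; the real ones depend on the dataset.
n_features, n_latent = 30, 8

encoder = nn.Sequential(nn.Linear(n_features, n_latent), nn.ReLU())
decoder = nn.Sequential(nn.Linear(n_latent, n_features))
classifier = nn.Sequential(nn.Linear(n_latent, 1), nn.Sigmoid())
discriminator = nn.Sequential(nn.Linear(n_latent, 1), nn.Sigmoid())

bce = nn.BCELoss()
mse = nn.MSELoss()
lam = 1.0  # lambda: trades prediction accuracy for fairness

def generator_loss(x, y, s):
    """Loss for encoder/decoder/classifier: reconstruct x, predict y,
    and hide the sensitive attribute s from the discriminator."""
    z = encoder(x)
    loss_dec = mse(decoder(z), x)                          # reconstruction error
    loss_clf = bce(classifier(z).squeeze(1), y)            # prediction error
    loss_adv = bce(discriminator(z).squeeze(1), s)         # adversary's error
    # subtracting the adversary's loss pushes z to reveal nothing about s
    return loss_dec + loss_clf - lam * loss_adv
```

The discriminator itself is trained in alternation, minimizing its own BCE on predicting s from z, which is the usual adversarial setup.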
Output plots: Decoder Error, ROC Curve (Classifier), ROC Curve (Discriminator)