ALD

This repository contains the implementation of 'Automatic Location of Disparities' (ALD) for conducting algorithmic audits.

Installation

# install.packages("remotes")
remotes::install_github("https://github.com/moritzvz/ald")

ALD depends on several other packages for handling data, modeling, and generating reports: partykit, assertthat, magrittr, tidyselect, tibble, dplyr, tidyr, readr, rmarkdown, flextable, stringr, ggplot2, ggparty, cowplot, scales, hms.

Usage

The ALD audit:

is performed on a dataset of your choice that must be provided as a .csv file
requires a notion of fairness (notion_of_fairness), set to 'statistical parity' or 'equalized odds'

for 'statistical parity', you must set the outcome_variable argument to the name of the outcome variable in your dataset
for 'equalized odds', you must set the prediction_variable and ground_truth_variable arguments to the names of the prediction and ground truth variables in your dataset

by default, uses all other variables in your dataset (those that are not the outcome, prediction, or ground truth) as sensitive attributes in the audit. You can use the sensitive_attributes argument to restrict the sensitive attributes to a subset of your dataset's variables
requires a ranking mechanism (ranking_mechanism), which must be 'confidence' or 'magnitude'
requires a maximum number of groups in the report (n_grp)
requires a number of trees to model in partykit::cforest (ntree)
requires an alpha argument passed to partykit::cforest (alpha)
optionally takes a p-value adjustment method to pass to stats::p.adjust (adjust_method), either "BH" (Benjamini-Hochberg, the default) or "bonferroni" (Bonferroni correction)
optionally takes a random seed that can be used to make results reproducible
writes a report to the directory that you set with the dir argument, with the data_name argument used in the report name
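
To make the two notions of fairness more concrete, here is a minimal sketch in base R on made-up data. It is not ALD's internal computation: the toy data frame, its column names, and the helper functions tpr() and fpr() are assumptions for illustration only. Statistical parity compares the rate of positive outcomes between groups, while equalized odds compares the true positive and false positive rates between groups.

```r
# Toy data (hypothetical): binary predictions, ground truth, and one
# sensitive attribute with two groups.
toy <- data.frame(
  group        = c("a", "a", "a", "a", "b", "b", "b", "b"),
  prediction   = c(1, 1, 0, 1, 0, 1, 0, 0),
  ground_truth = c(1, 0, 0, 1, 1, 1, 0, 0)
)

in_a <- toy$group == "a"

# Statistical parity: difference in the rate of positive outcomes.
# Here the prediction column plays the role of the outcome variable.
parity_gap <- mean(toy$prediction[in_a]) - mean(toy$prediction[!in_a])

# Equalized odds: differences in true positive and false positive rates.
tpr <- function(pred, truth) mean(pred[truth == 1])
fpr <- function(pred, truth) mean(pred[truth == 0])

tpr_gap <- tpr(toy$prediction[in_a], toy$ground_truth[in_a]) -
  tpr(toy$prediction[!in_a], toy$ground_truth[!in_a])
fpr_gap <- fpr(toy$prediction[in_a], toy$ground_truth[in_a]) -
  fpr(toy$prediction[!in_a], toy$ground_truth[!in_a])

print(c(parity = parity_gap, tpr = tpr_gap, fpr = fpr_gap))
```

A gap of zero on the relevant measure(s) would mean the two groups are treated alike under that notion; how ALD itself quantifies and ranks disparities is described in the paper cited below.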

# for example, an audit for 'equalized odds'
ald_audit(
  file                  = "my_data.csv",
  prediction_variable   = "prediction",
  ground_truth_variable = "ground_truth",
  notion_of_fairness    = "equalized odds",
  ranking_mechanism     = "confidence",
  data_name             = "data_title",
  dir                   = here::here(""),
  n_grp                 = 3,
  ntree                 = 25,
  alpha                 = 0.1)

# and an audit for 'statistical parity'
ald_audit(
  file                  = "my_data.csv",
  outcome_variable      = "outcome",
  notion_of_fairness    = "statistical parity",
  ranking_mechanism     = "confidence",
  data_name             = "data_title",
  dir                   = here::here(""),
  n_grp                 = 3,
  ntree                 = 25,
  alpha                 = 0.1)
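
The two values accepted by adjust_method correspond to methods of stats::p.adjust. As a rough illustration that is independent of ALD, the sketch below applies both corrections to a vector of made-up p-values. The values and the 0.1 cutoff are assumptions chosen for the example, and the cutoff is unrelated to the alpha argument passed to partykit::cforest.

```r
# Hypothetical raw p-values, e.g. one per candidate group.
p_raw <- c(0.001, 0.01, 0.03, 0.04, 0.2)

# Benjamini-Hochberg controls the false discovery rate.
p_bh <- p.adjust(p_raw, method = "BH")

# Bonferroni controls the family-wise error rate and is more conservative.
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Count how many p-values stay below an illustrative cutoff.
cutoff <- 0.1
n_bh   <- sum(p_bh <= cutoff)
n_bonf <- sum(p_bonf <= cutoff)

print(data.frame(p_raw, p_bh, p_bonf))
```

Bonferroni-adjusted p-values are never smaller than the Benjamini-Hochberg ones, so choosing "bonferroni" can only reduce, not increase, the number of findings that remain below a given cutoff.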

Citation

Please consider citing us if you find this helpful for your work:

@inproceedings{vonZahn.2023,
  title={Locating disparities in machine learning},
  author={von Zahn, Moritz and Hinz, Oliver and Feuerriegel, Stefan},
  booktitle={2023 IEEE International Conference on Big Data (BigData)},
  pages={1883--1894},
  year={2023},
  organization={IEEE}
}
