MLJFair is a comprehensive bias audit and mitigation toolkit in julia. Extensive support and functionality provided by MLJ has been used in this package.
using Pkg
Pkg.activate("my_environment", shared=true)
Pkg.add("https://github.com/ashryaagr/MLJFair.jl")
Pkg.add("MLJ")
- As of writing, it is the only bias audit and mitigation toolkit to support data with multi-valued protected attribute. For eg. If the protected attribute, say race has more than 2 values: "Asian", "African", "American"..so on, then MLJFair can easily handle it with normal workflow.
- Due to the support for multi-valued protected attribute, intersectional fairness can also be dealt with this toolkit. For eg. If the data has 2 protected attributes, say race and gender, then MLJFair can be used to handle it by combining the attributes like "female_american", "male_asian"...so on.
- Extensive support and functionality provided by MLJ can be leveraged when using MLJFair.
- Tuning of models using MLJTuning from MLJ. Numerious ML models from MLJModels can be used together with MLJFair.
- It leverages the flexibility and speed of Julia to make it more efficient and easy-to-use at the same time
- Well structured and intutive design
- Extensive tests and Documentation
- Documentation is a good starting point for this package.
- To understand MLJFair, it is recommended that the user goes through the MLJ Documentation. It shall help the user in understanding the usage of machine, evaluate, etc.
Following is an introductory example of using MLJFair. Observe how easy it has become to measure and mitigate bias in Machine Learning algorithms.
using MLJFair, MLJ
X, y, ŷ = @load_toydata
julia> model = ConstantClassifier()
ConstantClassifier() @904
julia> wrappedModel = ReweighingSamplingWrapper(model, grp=:Sex)
ReweighingSamplingWrapper(
grp = :Sex,
classifier = ConstantClassifier(),
noSamples = -1) @312
julia> evaluate(
wrappedModel,
X, y,
measures=MetricWrappers(
[true_positive, true_positive_rate]; grp=:Sex))
┌────────────────────┬─────────────────────────────────────────────────────────────────────────────────────┬───────────────────────────────────── ⋯
│ _.measure │ _.measurement │ _.per_fold ⋯
├────────────────────┼─────────────────────────────────────────────────────────────────────────────────────┼───────────────────────────────────── ⋯
│ true_positive │ Dict{Any,Any}("M" => 2,"overall" => 4,"F" => 2) │ Dict{Any,Any}[Dict("M" => 0,"overall ⋯
│ true_positive_rate │ Dict{Any,Any}("M" => 0.8333333333333334,"overall" => 0.8333333333333334,"F" => 1.0) │ Dict{Any,Any}[Dict("M" => 4.99999999 ⋯
└────────────────────┴─────────────────────────────────────────────────────────────────────────────────────┴───────────────────────────────────── ⋯
MLJFair is divided into following components
It is a 3D matrix of values of TruePositives, False Negatives, etc for each group. It greatly helps in optimization and removing the redundant calculations.
Name | Metric Instances |
---|---|
True Positive | truepositive, true_positive |
True Negative | truenegative, true_negative |
False Positive | falsepositive, false_positive |
False Negative | falsenegative, false_negative |
True Positive Rate | truepositive_rate, true_positive_rate, tpr, recall, sensitivity, hit_rate |
True Negative Rate | truenegative_rate, true_negative_rate, tnr, specificity, selectivity |
False Positive Rate | falsepositive_rate, false_positive_rate, fpr, fallout |
False Negative Rate | falsenegative_rate, false_negative_rate, fnr, miss_rate |
False Discovery Rate | falsediscovery_rate, false_discovery_rate, fdr |
Precision | positivepredictive_value, positive_predictive_value, ppv |
Negative Predictive Value | negativepredictive_value, negative_predictive_value, npv |
Name | Formula | Value for Custom function (func) |
---|---|---|
disparity | metric(Gᵢ)/metric(RefGrp) ∀ i | func(metric(Gᵢ), metric(RefGrp)) ∀ i |
parity | [ (1-ϵ) <= dispariy_value[i] <= 1/(1-ϵ) ∀ i ] | [ func(disparity_value[i]) ∀ i ] |
These metrics shall use either parity or shall have custom implementation to return boolean values
Metric | Aliases |
---|---|
Demographic Parity | DemographicParity |
These algorithms are wrappers. These help in mitigating bias and improve fairness.
Algorithm Name | Metric Optimised | Supports Multi-valued protected attribute | Type | Reference |
---|---|---|---|---|
Reweighing | General | ✔️ | Preprocessing | Kamiran and Calders, 2012 |
Reweighing-Sampling | General | ✔️ | Preprocessing | Kamiran and Calders, 2012 |
Equalized Odds Algorithm | Equalized Odds | ✔️ | Postprocessing | Hardt et al., 2016 |
LinProg Algorithm | Any metric | ✔️ | Postprocessing | Our own algorithm |
Meta-Fair algorithm[WIP] | Any metric | ✔️ | Inprocessing | Celis et al.. 2018 |