joshchang / shapley-regression

For calculating Shapley values via linear regression.

Shapley Regression

This repository implements a regression-based approach to estimating Shapley values. Although the code can be used with any cooperative game, our focus is model explanation methods such as SHAP, SAGE, and Shapley Effects, which are the Shapley values of several specific cooperative games. The methods provided here were developed in the paper cited under References.
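
To make the regression view concrete, the following is a minimal sketch (not the code in this repository) showing that, for a small game where every coalition can be enumerated, the Shapley values are the solution of a weighted least squares problem with the Shapley kernel weights and an efficiency constraint. The names `toy_game`, `exact_shapley`, `solve`, and `regression_shapley` are hypothetical and exist only for this illustration; exact fractions are used so the two results can be compared without rounding.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb, factorial


def toy_game(S):
    # Hypothetical game: additive worths plus an interaction between players 0 and 1.
    worth = {0: 4, 1: 2, 2: 1, 3: 0}
    return sum(worth[i] for i in S) + (6 if {0, 1} <= S else 0)


def exact_shapley(value, d):
    """Shapley values from the definition: average marginal contribution over all orderings."""
    phi = [Fraction(0)] * d
    for order in permutations(range(d)):
        members = set()
        for i in order:
            before = value(frozenset(members))
            members.add(i)
            phi[i] += Fraction(value(frozenset(members)) - before)
    return [p / factorial(d) for p in phi]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def regression_shapley(value, d):
    """Shapley values as the solution of the Shapley-kernel weighted least squares problem."""
    empty, full = value(frozenset()), value(frozenset(range(d)))
    A = [[Fraction(0)] * d for _ in range(d)]  # Z^T W Z
    b = [Fraction(0)] * d                      # Z^T W (v(S) - v(empty))
    for size in range(1, d):
        weight = Fraction(d - 1, comb(d, size) * size * (d - size))
        for S in combinations(range(d), size):
            target = Fraction(value(frozenset(S)) - empty)
            for i in S:
                b[i] += weight * target
                for j in S:
                    A[i][j] += weight
    # KKT system for the constraint sum(beta) = v(full) - v(empty):
    #   A beta + mu * 1 = b,   1^T beta = v(full) - v(empty)
    kkt = [A[i] + [Fraction(1)] for i in range(d)] + [[Fraction(1)] * d + [Fraction(0)]]
    rhs = b + [Fraction(full - empty)]
    return solve(kkt, rhs)[:d]


by_definition = exact_shapley(toy_game, 4)
by_regression = regression_shapley(toy_game, 4)
print(by_definition == by_regression)
```

Enumerating all coalitions is only feasible for a handful of players; with many features the same regression has to be fit on sampled coalitions, which is where approximation and its uncertainty come in.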

Because approximations are essential in most practical Shapley value applications, we provide an estimation approach with the following convenient features:

  1. Convergence detection: the estimator stops automatically when it is approximately converged, so you don't need to specify the number of samples.

  2. Convergence forecasting: for use cases that take a long time to run, our implementation forecasts the amount of time required to reach convergence (displayed with a progress bar).

  3. Uncertainty estimation: Shapley values are often estimated rather than calculated exactly, and our method provides confidence intervals for the results.
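
The three features above can be illustrated with a simple stand-in estimator. The sketch below uses plain permutation sampling rather than this repository's regression estimator, and the stopping rule (largest standard error divided by the spread of the estimates, compared with a threshold), the forecast formula, and all names and default values are assumptions made for illustration only.

```python
import math
import random


def toy_game(S):
    # Hypothetical game: additive worths plus an interaction between players 0 and 1.
    worth = {0: 4, 1: 2, 2: 1, 3: 0}
    return sum(worth[i] for i in S) + (6 if {0, 1} <= S else 0)


def estimate_until_converged(value, d, thresh=0.02, batch=64,
                             max_samples=200_000, seed=0):
    rng = random.Random(seed)
    players = list(range(d))
    n = 0
    mean = [0.0] * d
    m2 = [0.0] * d      # running sums of squared deviations (Welford's method)
    forecasts = []      # forecast total sample counts, one per unconverged check
    while n < max_samples:
        for _ in range(batch):
            # One sample: marginal contributions along a random ordering.
            rng.shuffle(players)
            members, prev = set(), value(frozenset())
            contrib = [0.0] * d
            for i in players:
                members.add(i)
                cur = value(frozenset(members))
                contrib[i] = cur - prev
                prev = cur
            n += 1
            for i in range(d):
                delta = contrib[i] - mean[i]
                mean[i] += delta / n
                m2[i] += delta * (contrib[i] - mean[i])
        # Uncertainty estimation: standard error of each running mean.
        std_err = [math.sqrt(m2[i] / (n - 1) / n) for i in range(d)]
        spread = max(mean) - min(mean)
        ratio = max(std_err) / spread if spread > 0 else float("inf")
        # Convergence detection: stop once the relative error is small enough.
        if ratio < thresh:
            break
        # Convergence forecasting: standard errors shrink like 1 / sqrt(n).
        forecasts.append(n * (ratio / thresh) ** 2)
    half_width = [1.96 * s for s in std_err]  # approximate 95% confidence intervals
    return mean, half_width, n, forecasts


mean, half_width, n, forecasts = estimate_until_converged(toy_game, 4)
for i in range(4):
    print(f"player {i}: {mean[i]:.3f} +/- {half_width[i]:.3f}")
print("samples used:", n)
```

The caller never chooses a sample count here: the loop ends when the relative uncertainty falls below `thresh`, and each forecast gives a rough idea of how much work remains.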

Usage

To use the code, clone this repository and install the package into your Python environment:

pip install .

Next, to run the code you only need to do two things: 1) specify a cooperative game, and 2) run the Shapley value estimator. For example, you can calculate SHAP values as follows:

from shapreg import removal, games, shapley

# Get data
x, y = ...
feature_names = ...

# Get model (a callable object)
model = ...

# Set up the cooperative game (SHAP)
imputer = removal.MarginalExtension(x[:128], model)
game = games.PredictionGame(imputer, x[0])

# Estimate Shapley values
values = shapley.ShapleyRegression(game)
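
For intuition about what the cooperative game in the SHAP case represents, here is a self-contained sketch that does not use `shapreg`. It assumes that the game's value for a coalition S is the model's average output when the features outside S are replaced with values taken from background rows; the helper `make_prediction_game`, the toy `linear_model`, and the data are all hypothetical.

```python
def make_prediction_game(model, background, x):
    """Return v(S): the model's average output when features outside S are
    replaced by the corresponding values from each background row."""
    def value(S):
        total = 0.0
        for row in background:
            mixed = [x[j] if j in S else row[j] for j in range(len(x))]
            total += model(mixed)
        return total / len(background)
    return value


def linear_model(features):
    # Hypothetical model, used only to make the example checkable by hand.
    return 2.0 * features[0] - 1.0 * features[1] + 0.5 * features[2]


background = [[0.0, 0.0, 0.0], [1.0, 2.0, 4.0], [2.0, 1.0, 2.0]]
x = [3.0, 1.0, 0.0]
game = make_prediction_game(linear_model, background, x)

print(game(frozenset()))           # average prediction over the background rows
print(game(frozenset({0, 1, 2})))  # prediction for x itself
print(game(frozenset({0})) - game(frozenset()))  # gain from revealing feature 0
```

The players of the game are the features, the empty coalition corresponds to the average prediction, and the full coalition corresponds to the prediction being explained.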

For examples, see the following notebooks:

  • Census: shows how to explain individual predictions (SHAP)
  • Credit: shows how to explain the model's loss (SAGE)
  • Bank: shows how to explain the model's global sensitivity (Shapley Effects)
  • Consistency: verifies that our different estimators return the same results
  • Calibration: verifies the accuracy of our uncertainty estimates

Authors

References

Ian Covert and Su-In Lee. "Improving KernelSHAP: Practical Shapley Value Estimation via Linear Regression." arXiv preprint arXiv:2012.01536

License: MIT License


Languages

Language: Python 100.0%