bdwilliamson / spvim_supplementary

Reproduce analyses from "Efficient nonparametric statistical inference on population feature importance using Shapley values"

spvim_supplementary: Supplementary materials for the SPVIM paper

This repository contains the supplementary material for, and the code to reproduce the analyses in, "Efficient nonparametric statistical inference on population feature importance using Shapley values" by Williamson and Feng (arXiv, 2020; to appear in the Proceedings of the Thirty-seventh International Conference on Machine Learning [ICML 2020]). All analyses were implemented in the freely available languages Python (version 3.7.4) and R (version 3.6.3).

This README file provides an overview of the code available in the repository.

Code directory

The code is split into two sub-directories, one for each of the two main objectives of the manuscript:

  1. Numerical experiments to evaluate the operating characteristics of our proposed method (sims).
  2. An analysis of patients' stays in the ICU from the Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database (data_analysis).

All analyses were performed on a Linux cluster using the Slurm batch scheduling system. The head node of the batch scheduler allows the shorthand "ml" in place of "module load". If you use a different batch scheduling system, each code file flags the line where you can change batch variables. You may instead run the analyses locally, but they will then take a large amount of time.
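
On a system where the "ml" shorthand is not defined, one option is to edit the scripts to say "module load"; another is to define a small fallback before running them. The following is a minimal sketch, assuming a POSIX-compatible shell and that a `module` command is available in the environment. It is not part of the repository, and the module name in the usage comment is hypothetical.

```shell
# Define `ml` as shorthand for `module load`, but only when the shell
# does not already provide an `ml` command (assumption: `module` exists).
if ! command -v ml >/dev/null 2>&1; then
    ml() {
        module load "$@"
    }
fi

# Hypothetical usage; substitute the module names your cluster provides:
#   ml R/3.6.3
```

Placing this in a shell startup file or at the top of a batch script leaves the repository's scripts unchanged. Note that where a cluster already ships its own `ml` command, this snippet does nothing.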


Issues

If you encounter any bugs or have any specific questions about the analysis, please file an issue.

About

License: MIT License


Languages

Python 66.7%, R 31.6%, Shell 1.8%