This repository contains the code for the paper "Estimation of Causal Effects in the Presence of Unobserved Confounding in the Alzheimer's Continuum." If you use this code, please cite:
@inproceedings{Poelsterl2021-causal-effects-in-ad,
title = {{Estimation of Causal Effects in the Presence of Unobserved Confounding in the Alzheimer's Continuum}},
author = {P{\"{o}}lsterl, Sebastian and Wachinger, Christian},
booktitle = {Information Processing in Medical Imaging},
year = {2021},
pages = {45--57},
doi = {10.1007/978-3-030-78191-0_4},
url = {https://arxiv.org/abs/2006.13135},
}
- Install Docker.
- Build the Docker image causalad:
docker build -t causalad .
If you want to use the code for development, you can use conda
to create an environment with all dependencies from requirements.yaml.
This section provides an overview of how to obtain the data needed to
reproduce the results presented in the paper. As the data cannot
be shared publicly, you will have to perform the data processing
yourself to fill in the missing values of data/adni-data-template.csv
and data/ukb-data-template.csv. These files list the patient ID and,
for ADNI, the visit and image ID, which uniquely identify the data
you need to obtain. We expect that you have been approved to access
the data and are familiar with the data portals of ADNI and UK Biobank.
- Log in to the ADNI Data Portal.
- Download ADNIMERGE.CSV, APOERES.csv and UPENNBIOMK_MASTER.csv.
- Copy APOERES.csv to the data/ directory.
- Use ABETA, PTAU, TAU from UPENNBIOMK_MASTER.csv to determine which patients have Alzheimer's pathology by creating a column ATN_status that describes the A/T/N scheme, e.g. A+/T+/N- if ABETA ≤ 192, PTAU ≥ 23, and TAU < 93, following the thresholds from Ekman et al., 2018:
The individual CSF values were considered pathological (+) if ≤192 pg/ml for Aβ42, ≥93 pg/ml for t-tau, and ≥23 pg/ml for p-tau.
- Download T1 structural brain MRI from the ADNI Data Portal and segment each with FreeSurfer 5.3 to obtain volume and thickness measurements.
- Fill in the values of data/adni-data-template.csv by taking ABETA, PTAU, TAU from UPENNBIOMK_MASTER.csv, ATN_status from above, volume and thickness measurements computed by FreeSurfer, and the remaining variables from ADNIMERGE.CSV. Save the resulting file as data/adni-data.csv.
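The ATN_status step in the list above can be sketched as a small Python function. This is a minimal illustration, not the authors' processing code: the function name atn_status is hypothetical, and it assumes the CSF values have already been parsed into plain numbers in pg/ml. Only the three thresholds are taken from the Ekman et al., 2018 quote.

```python
# Thresholds quoted from Ekman et al., 2018 (all in pg/ml).
ABETA_THRESHOLD = 192.0  # A+ if ABETA <= 192
PTAU_THRESHOLD = 23.0    # T+ if PTAU >= 23
TAU_THRESHOLD = 93.0     # N+ if TAU >= 93


def atn_status(abeta: float, ptau: float, tau: float) -> str:
    """Return an A/T/N label such as 'A+/T+/N-' (hypothetical helper).

    Assumes all three inputs are numeric; how non-numeric or missing
    entries should be handled is not specified by this README.
    """
    a = "+" if abeta <= ABETA_THRESHOLD else "-"
    t = "+" if ptau >= PTAU_THRESHOLD else "-"
    n = "+" if tau >= TAU_THRESHOLD else "-"
    return f"A{a}/T{t}/N{n}"


if __name__ == "__main__":
    # Matches the example in the text: ABETA <= 192, PTAU >= 23, TAU < 93.
    print(atn_status(150.0, 30.0, 80.0))  # A+/T+/N-
```

Applying the function to each row of ABETA, PTAU and TAU values yields the ATN_status column that the template expects.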
- Log in to the UK Biobank Access Management System.
- Download data on Sex, and Age at first imaging visit.
- Download T1 structural brain MRI and segment each with FreeSurfer 5.3 to obtain volume measurements.
- Fill in the values of data/ukb-data-template.csv and save the result as data/ukb-data.csv.
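Before running the analysis, it may help to confirm that no cells of the filled-in files were left empty. The sketch below is one way to do that with the standard library only. It assumes that unfilled cells in the templates are empty strings, which this README does not state, and the helper name find_incomplete_rows is hypothetical.

```python
import csv
from pathlib import Path


def find_incomplete_rows(lines):
    """Return (data_row_number, [column names]) for rows with empty cells.

    `lines` is any iterable of CSV text lines, e.g. an open file.
    Data rows are numbered from 1, not counting the header.
    """
    reader = csv.reader(lines)
    header = next(reader)
    incomplete = []
    for row_no, row in enumerate(reader, start=1):
        if not row:  # skip completely blank lines
            continue
        empty = [name for name, value in zip(header, row) if value.strip() == ""]
        # Columns missing entirely from a short row also count as empty.
        empty.extend(header[len(row):])
        if empty:
            incomplete.append((row_no, empty))
    return incomplete


if __name__ == "__main__":
    for name in ("data/adni-data.csv", "data/ukb-data.csv"):
        path = Path(name)
        if not path.exists():
            print(f"{name}: not found")
            continue
        with path.open(newline="") as f:
            problems = find_incomplete_rows(f)
        print(f"{name}: {len(problems)} row(s) with empty cells")
```

If some variables are legitimately missing for certain subjects, the reported rows need to be reviewed by hand rather than treated as errors.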
- Make sure you created data/adni-data.csv as outlined above.
- The workflow is split into 3 notebooks that have to be executed sequentially. Begin by starting the Jupyter notebook server:
docker run --rm -p 8888:8888 -v $(pwd)/data:/notebooks/data causalad
- Go to https://localhost:8888 or click on the link that is printed when running the above command.
- Click on the adni-estimate-substitute.ipynb notebook, which will open a new tab. In the menu at the top, go to Cell and click Run All. Once completed, this will create data/adni-transformed.csv and 4 files in data/outputs/adni/dim6/:
  - adni_aug_BPMF.csv: Transformed features with 6 substitute confounders estimated by BPMF.
  - adni_aug_PPCA.csv: Transformed features with 6 substitute confounders estimated by PPCA.
  - adni_aug_regressout.csv: Transformed features with observed confounders regressed out by linear regression.
  - pvalue.csv: Bayesian p-value of posterior predictive check for BPMF and PPCA.
- Go back to https://localhost:8888, and click on the adni-estimate-effects.ipynb notebook. In the new tab, go to Cell and click Run All. Once completed, 4 additional files are created in data/outputs/adni/dim6/:
  - coef_bpmf.csv: Estimated coefficients of Beta-regression model when accounting for observed confounders and 6 substitute confounders estimated by BPMF.
  - coef_noconf.csv: Estimated coefficients of Beta-regression model when ignoring confounding.
  - coef_ppca.csv: Estimated coefficients of Beta-regression model when accounting for observed confounders and 6 substitute confounders estimated by PPCA.
  - coef_regout.csv: Estimated coefficients of Beta-regression model when accounting for observed confounders via the regress-out approach.
- Finally, go back to https://localhost:8888 and open the adni-compare-effects.ipynb notebook. In the new tab, go to Cell and click Run All. This will display a figure comparing the estimated credible intervals for each model from the previous step. In addition, it writes the figure to data/outputs/adni/dim6/coef-horizontal.pdf.
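After all three notebooks have run, a quick check that every output file listed above exists can catch a notebook that stopped early. The following is a minimal sketch. The file names come from this README, while the helper missing_outputs is a hypothetical name and the script assumes it is run from the repository root on the host.

```python
from pathlib import Path

# Output files named in this README for the ADNI workflow.
ADNI_OUTPUT_DIR = Path("data/outputs/adni/dim6")
ADNI_EXPECTED = [
    "adni_aug_BPMF.csv",
    "adni_aug_PPCA.csv",
    "adni_aug_regressout.csv",
    "pvalue.csv",
    "coef_bpmf.csv",
    "coef_noconf.csv",
    "coef_ppca.csv",
    "coef_regout.csv",
    "coef-horizontal.pdf",
]


def missing_outputs(output_dir, expected_names):
    """Return the sorted names from expected_names that are not files in output_dir."""
    base = Path(output_dir)
    return sorted(name for name in expected_names if not (base / name).is_file())


if __name__ == "__main__":
    missing = missing_outputs(ADNI_OUTPUT_DIR, ADNI_EXPECTED)
    if not Path("data/adni-transformed.csv").is_file():
        missing.append("data/adni-transformed.csv")
    print("All outputs present." if not missing else f"Missing: {missing}")
```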
- Make sure you created data/ukb-data.csv as outlined above.
- To execute all steps of the simulation study, you will need at least 64GB of RAM. The entire process takes approximately 4 hours and can be started by executing:
docker run --rm -v $(pwd)/data:/notebooks/data causalad ./run-synthetic-ukb.sh
- This will create 4 files in data/outputs/synthetic_ukb/:
  - augmented_data_bpmf_dim5.pkl: Transformed features with 5 substitute confounders estimated by BPMF.
  - augmented_data_ppca_dim5.pkl: Transformed features with 5 substitute confounders estimated by PPCA.
  - coefs_dim5.pkl: Estimated coefficients of all models for 1000 different simulated outcomes.
  - evaluation_coefs_dim5.csv: RMSE of coefficients with respect to true coefficients for each model, across 1000 simulations.
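For readers unfamiliar with the metric reported in evaluation_coefs_dim5.csv, the sketch below shows the textbook root mean squared error between a vector of estimated coefficients and the true coefficients. It is an illustration only: the function name coefficient_rmse is hypothetical, and how the repository's scripts aggregate the error across the 1000 simulations is not described in this README.

```python
import math


def coefficient_rmse(estimated, true):
    """Root mean squared error between two equally long coefficient sequences."""
    if len(estimated) != len(true) or len(true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - t) ** 2 for e, t in zip(estimated, true)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Two coefficients, each off by 2: mean squared error 4, RMSE 2.
    print(coefficient_rmse([2.0, 2.0], [0.0, 0.0]))  # 2.0
```

A lower value means the estimated coefficients of a model lie closer to the coefficients used to simulate the outcome.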