Pooling datasets from multiple studies can significantly improve statistical power: larger sample sizes can enable the identification of otherwise weak disease-specific patterns. When modern learning methods are used (e.g., for predicting progression to dementia), differences in data acquisition methods or scanner protocols can enable the model to “cheat”, i.e., to exploit site-specific artifacts rather than disease-specific features. In this study, we develop a method to harmonize the performance of DNN classifiers across scanners/sites via so-called fairness constraints, thereby encouraging consistent behavior while controlling for site-specific nuisance variables.
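To make the idea of a site-level fairness constraint concrete, here is a minimal sketch. It is not the method developed in this study. It assumes a simple demographic-parity-style penalty that measures how far each site's mean predicted score drifts from the pooled mean. The names `site_gap_penalty`, `penalized_loss`, and the weight `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def site_gap_penalty(scores, sites):
    """Largest absolute gap between a site's mean score and the pooled mean.

    Illustrative assumption: a parity-style constraint on mean predicted
    scores. The actual constraint used in this project may differ.
    """
    scores = np.asarray(scores, dtype=float)
    sites = np.asarray(sites)
    pooled_mean = scores.mean()
    gaps = [abs(scores[sites == s].mean() - pooled_mean) for s in np.unique(sites)]
    return float(max(gaps))


def penalized_loss(task_loss, scores, sites, lam=1.0):
    """Task loss plus a weighted site-gap penalty (lam is a hypothetical weight)."""
    return task_loss + lam * site_gap_penalty(scores, sites)


if __name__ == "__main__":
    # Two sites whose predicted scores differ systematically.
    scores = [0.2, 0.4, 0.6, 0.8]
    sites = [0, 0, 1, 1]
    print(site_gap_penalty(scores, sites))
    print(penalized_loss(1.0, scores, sites, lam=2.0))
```

A penalty of this kind is zero when every site has the same mean predicted score and grows as the classifier's outputs become more site-dependent, which is the behavior a harmonization constraint is meant to discourage.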
Code will be made available upon request.
The slides are available in the main directory as slides_multisitepooling.pdf. A video walkthrough of the slides is available on YouTube: https://youtu.be/xEgcujq2QmU