This repository is the official PyTorch implementation of "Handling Missing Data with Graph Representation Learning". [Project webpage].
Jiaxuan You*, Xiaobai Ma*, Daisy Ding*, Mykel Kochenderfer, Jure Leskovec, Handling Missing Data with Graph Representation Learning, NeurIPS 2020.
GRAPE is a framework for feature imputation and label prediction. It tackles the missing data problem with a graph representation: observations and features are the two node types of a bipartite graph, and each observed feature value is an edge between an observation node and a feature node. Under this framework, feature imputation is formulated as an edge-level prediction task and label prediction as a node-level prediction task. Both tasks are then solved with Graph Neural Networks.
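To make the bipartite view concrete, here is a minimal sketch in plain Python. It is not the repository's data loader: the function name `build_bipartite_edges`, the node-numbering scheme (observations first, then features), and the use of `None`/NaN to mark missing entries are all assumptions made for illustration.

```python
import math


def build_bipartite_edges(X):
    """Convert a data matrix with missing entries into bipartite edge lists.

    Hypothetical helper for illustration only (not part of the GRAPE code).
    Observation i becomes node i; feature j becomes node n_obs + j.
    """
    n_obs = len(X)
    observed_edges = []  # (observation_node, feature_node, value)
    missing_edges = []   # (observation_node, feature_node): edges to predict
    for i, row in enumerate(X):
        for j, value in enumerate(row):
            feature_node = n_obs + j
            if value is None or math.isnan(value):
                # A missing value is an absent edge; imputing it is
                # an edge-level prediction.
                missing_edges.append((i, feature_node))
            else:
                # An observed value is an edge carrying that value.
                observed_edges.append((i, feature_node, value))
    return observed_edges, missing_edges


# Two observations, three features, two missing entries.
X = [
    [1.0, float("nan"), 3.0],
    [None, 5.0, 6.0],
]
observed_edges, missing_edges = build_bipartite_edges(X)
print(observed_edges)
print(missing_edges)
```

In this sketch the observed edges would serve as the input graph, while the missing pairs are the edges whose values a model would be asked to predict.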
From the repository root, create and activate the conda environment:
conda env create -f environment.yml
conda activate grape
Then install PyTorch.
Then install PyTorch_Geometric.
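After installation, a quick check that both packages are visible to the active environment can save a failed training run. The snippet below is a minimal sketch, not part of the repository; the helper name `check_packages` is hypothetical, and it only tests whether the import names `torch` and `torch_geometric` can be located, not whether their versions are compatible.

```python
import importlib.util


def check_packages(names=("torch", "torch_geometric")):
    """Report which top-level packages can be found (hypothetical helper)."""
    # find_spec returns None when a top-level package is not installed.
    return {name: importlib.util.find_spec(name) is not None for name in names}


status = check_packages()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```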
To train feature imputation on UCI datasets:
# UCI: concrete, energy, housing, kin8nm, naval, power, protein, wine, yacht
python train_mdi.py uci --data concrete
To train label prediction on UCI datasets:
# UCI: concrete, energy, housing, kin8nm, naval, power, protein, wine, yacht
python train_y.py uci --data concrete
To train feature imputation on Flixster, Douban and YahooMusic:
# flixster, douban, yahoo_music
python train_mdi.py mc --data flixster
The results will be saved in "uci/test/dataset_name" or "mc/test/dataset_name". For more training options, look at the arguments in "train_mdi.py" and "train_y.py" as well as "uci/uci_subparser.py" and "mc/mc_subparser.py".
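To run every dataset in turn, the commands above can be generated programmatically. The following is a minimal sketch rather than a script shipped with the repository: the helper `build_commands` is hypothetical, and it uses only the script names, subcommands, `--data` flag, and dataset names listed above. It prints the commands instead of executing them.

```python
import shlex

# Dataset names as listed in the usage comments above.
UCI_DATASETS = ["concrete", "energy", "housing", "kin8nm", "naval",
                "power", "protein", "wine", "yacht"]
MC_DATASETS = ["flixster", "douban", "yahoo_music"]


def build_commands(script, domain, datasets):
    """Build one argument list per dataset (hypothetical helper)."""
    return [["python", script, domain, "--data", name] for name in datasets]


commands = (
    build_commands("train_mdi.py", "uci", UCI_DATASETS)   # feature imputation
    + build_commands("train_y.py", "uci", UCI_DATASETS)   # label prediction
    + build_commands("train_mdi.py", "mc", MC_DATASETS)   # matrix completion
)

for cmd in commands:
    print(shlex.join(cmd))
```

To launch the runs rather than print them, each argument list can be passed to `subprocess.run(cmd, check=True)` from the repository root.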
If you find this work useful, please cite our paper:
@article{you2020handling,
title={Handling Missing Data with Graph Representation Learning},
author={You, Jiaxuan and Ma, Xiaobai and Ding, Daisy and Kochenderfer, Mykel and Leskovec, Jure},
journal={NeurIPS},
year={2020}
}