This is the code implementation for the paper Uplift Modeling based on Graph Neural Network Combined with Causal Weighting
A bridge between Causal Inference, Machine Learning and Personalization---Uplift Modelling
Uplift modeling is a statistical technique used to determine the effect of a treatment (marketing campaign/offer/discount) on a particular outcome of interest. It is commonly used in marketing and customer relationship management to identify the specific customers or groups of customers who are most likely to respond positively to a given treatment or intervention.
Uplift modeling involves building a statistical model that predicts the difference in the outcome of interest between the treatment and control groups. The model is trained on data from a controlled experiment or observational study, in which a treatment group is exposed to the treatment and a control group is not. The classic machine learning propensity models predict the target variable (Y) given input variables (X). While the uplift model explains the impact of treatment(marketing campaign/offer/discount) on target variable (Y) given input features (X).
Potential Outcome Framework and Causal Effect
In the potential outcome framework, the causal effect of a treatment can be estimated by comparing the observed outcomes in the treatment group to the observed outcomes in the control group. Noted that this estimate is only valid if the treatment has been randomly assigned, and if the control group is representative of what the outcome would have been in the treatment group in the absence of the treatment. This is known as the "counterfactual" or "potential outcome" for the treatment group.
GNN Uplift Modeling Workflow
Causal Weighting Workflow
The following are the major open source packages utilised in this project:
- DoWhy
- Gensim
- EconML
- CausalML
- PyTorch Geometric
.
βββ notebooks
β βββ 0_ate_weighting.ipynb
β βββ 1_feature_correlations_BNEstimator.ipynb
β βββ 2_node_embeddings.ipynb
β βββ 3_causal_weighting_embedding(10d)_ate(13d).ipynb
β βββ 3_causal_without_memory.ipynb
β βββ README.md
β
βββ synthetic # synthetic experiments
β βββ synthetic_data
β β βββ data.csv # synthetic dataset with 5 confounders
β β βββ data_10000_9.csv # synthetic dataset with 9 confounders
β β βββ data_10000_20.csv # synthetic dataset with 20 confounders
β β βββ README.md
β β
β βββ synthetic_results
β β βββ data_10000_x5 # folder containing detailed uplift modelling for synthetic dataset with 5 confounders
β β βββ data_10000_x9 # folder containing detailed uplift modelling for synthetic dataset with 9 confounders
β β βββ data_10000_x20 # folder containing detailed uplift modelling for synthetic dataset with 20 confounders
β β βββ README.md
β β
β βββ README.md
β
β
βββ README.md
Follow notebooks in Google Colab
Experiments on more benchmark datasets
Experiments to validate scalablility of this causal knowledge-enabled GNNs
If you have any questions or suggestions towards this repository, feel free to contact me at xy2119@ic.ac.uk.
Any kind of enhancement or contribution is welcomed!