daniEL2371 / abtest-mlops

Design a reliable hypothesis test to determine whether the ads that the advertising company runs resulted in a significant lift in brand awareness. Classic hypothesis A/B testing is used to test whether a creative ad campaign resulted in a significant lift in brand awareness. A machine learning approach to A/B testing is then applied, and its result is compared with the classic hypothesis A/B testing result.

abtest-mlops

Overview

SmartAd is an advertising agency. It designs intuitive touch-enabled advertising and provides brands with an automated advertising experience via machine learning and creative excellence. SmartAd has implemented a new advertising campaign and collected Brand Impact Optimizer (BIO) data from 3 to 10 July 2020 to determine the impact of the creative ad it designed.

The objective is to design a reliable hypothesis testing algorithm for the BIO service and to determine whether a recent advertising campaign resulted in a significant lift in brand awareness.

This repo demonstrates how to apply classic A/B testing, sequential A/B testing, and a machine learning approach to A/B testing in order to determine whether a recent campaign resulted in a significant lift in brand awareness.

Requirements

Python 3.5 or above, pip, and MySQL

Install

git clone https://github.com/daniEL2371/abtest-mlops
cd abtest-mlops
pip install -r requirements.txt

Hyperparameter tuning

cd scripts
python dt_tune_train.py
python lr_tune_train.py
python xgb_tune_train.py

Model tracking

cd notebooks
mlflow ui

Features

Data

  • The data was collected from a BIO questionnaire, which asks users whether they are aware of the brand called Lux.
  • There are four versions of the data. Data versioning is implemented using DVC.

Notebooks

Classic A/B hypothesis testing

  • Classic A/B hypothesis testing is implemented inside notebooks/ClassicABTest.ipynb
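
As a rough illustration of what a classic A/B test computes, here is a minimal sketch of a one-sided two-proportion z-test. It is not taken from the notebook: the function name `two_proportion_z_test` and the counts are hypothetical, and the notebook may use a different test statistic.

```python
import math


def two_proportion_z_test(aware_control, n_control, aware_exposed, n_exposed):
    """One-sided two-proportion z-test (H1: exposed awareness rate is higher)."""
    p_control = aware_control / n_control
    p_exposed = aware_exposed / n_exposed
    # Pooled awareness rate under the null hypothesis of no difference.
    p_pooled = (aware_control + aware_exposed) / (n_control + n_exposed)
    std_err = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_control + 1 / n_exposed))
    z = (p_exposed - p_control) / std_err
    # Upper-tail p-value from the standard normal survival function.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical counts for illustration only, not taken from the BIO data.
z, p_value = two_proportion_z_test(
    aware_control=200, n_control=500, aware_exposed=230, n_exposed=500
)
print(f"z = {z:.3f}, one-sided p = {p_value:.4f}")
```

A small p-value (for example below a chosen significance level such as 0.05) would suggest that the exposed group has a higher awareness rate than the control group.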

Sequential A/B hypothesis testing

  • Sequential A/B hypothesis testing is implemented inside notebooks/Sequential.ipynb
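
To illustrate the idea behind sequential testing, the sketch below implements Wald's sequential probability ratio test (SPRT) for a single stream of yes/no responses. This is a simplified one-sample version chosen for illustration; the notebook may use a different variant. The function name `sprt_bernoulli`, the rates `p0` and `p1`, and the sample responses are all hypothetical.

```python
import math


def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.2):
    """Wald's SPRT for H0: rate = p0 versus H1: rate = p1 on 0/1 data.

    Returns a (decision, number_of_observations_used) tuple.
    """
    upper = math.log((1 - beta) / alpha)   # crossing this accepts H1
    lower = math.log(beta / (1 - alpha))   # crossing this accepts H0
    llr = 0.0                              # cumulative log-likelihood ratio
    for n, x in enumerate(observations, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "continue sampling", len(observations)


# Hypothetical responses (1 = aware of the brand, 0 = not aware).
responses = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1]
print(sprt_bernoulli(responses, p0=0.4, p1=0.6))
```

Unlike the classic test, the decision is checked after every observation, so the test can stop early once the evidence crosses either boundary.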

ML AB Testing

  • A machine learning approach to A/B testing, based on calculating the feature importance of variables, is implemented inside notebooks/ML_AbTest.ipynb
  • The notebook demonstrates how to use Logistic Regression, Decision Tree, and Gradient Boosting models to calculate feature importance.
  • Two versions of the data are used for training the models. One version contains a column called platform_os and the other contains a column called browser. This is done in order to determine the effect of each column on the models.
  • The resulting 6 models are stored in the folder called models.
  • The notebook also imports the hyperparameter tuning scripts from the scripts folder.
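
To make the notion of feature importance concrete, here is a dependency-free sketch of the quantity that a decision tree's importance score is built from: the Gini impurity decrease achieved by splitting on a feature. It is an illustration only, not the code from the notebook. The column names `experiment` and `aware`, the binary encoding of `platform_os`, and the toy rows are assumptions.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)


def impurity_decrease(rows, feature, target="aware"):
    """Gini impurity removed by splitting the rows on a binary feature."""
    parent = [r[target] for r in rows]
    left = [r[target] for r in rows if r[feature] == 0]
    right = [r[target] for r in rows if r[feature] == 1]
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
    return gini(parent) - weighted


# Toy data (hypothetical): 3 of 10 control users and 7 of 10 exposed users
# report being aware of the brand; platform_os simply alternates 0/1.
rows = []
for experiment, n_aware in ((0, 3), (1, 7)):
    for i in range(10):
        rows.append(
            {"experiment": experiment, "platform_os": i % 2, "aware": 1 if i < n_aware else 0}
        )

for feature in ("experiment", "platform_os"):
    print(feature, round(impurity_decrease(rows, feature), 4))
```

If the experiment group removes more impurity than the other columns, that suggests the ad exposure carries more information about brand awareness than those columns do.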

Models

All trained models are saved inside the models folder.

Scripts

  • Utility helper functions are implemented in the helper.py module
  • The Logger class for the project is implemented in the app_logger.py module
  • Plotting functions for scatter plots, histograms, distribution graphs, heat maps, bar plots, and count plots are implemented in the scripts/plots.py module
  • A custom DecisionTree model that extends sklearn's DecisionTree model for A/B testing is implemented in the decisionTreesModel.py module
  • A custom LogisticRegression model that extends sklearn's LogisticRegression model for A/B testing is implemented inside logesticRegressionModel.py
  • A custom xGBClassifierModel that extends sklearn's GradientBoostingClassifier model for A/B testing is implemented inside xGBClassifierModel.py
  • xgb_tune_train.py, lr_tune_train.py and dt_tune_train.py are hyperparameter tuning modules for the xGBClassifier, LogisticRegression, and DecisionTree models, respectively

Languages

Jupyter Notebook 98.8%, Python 1.2%