dujijundavid / E-Commence

Develops a marketing program targeting dormant one-time buyers on the platform to incentivize them to purchase again. Improves the program with machine learning models by at least 80%, with over 2 GB of training data processed on a single machine. Implements classification models, including Lasso logistic regression and Random Forest, with cross-validation optimization. Uses R and SQL for data importing, cleaning and reprocessing, model testing, and optimization.

E-Commerce Dormant User Activation

The data tables exceed the upload file size limit. Shareable link: https://drive.google.com/open?id=1xLkW5VLmFDvB9vU2wtaG-OZTiSAQaLey

Objective: Develop a marketing program targeting dormant one-time buyers on the platform to incentivize them to purchase again. Improve the program with machine learning models by 80%.

Work: Implement classification models, including Lasso logistic regression and Random Forest, with cross-validation optimization. Evaluate them with ROC curves and AUC scores, and compare the results against a baseline approach (random guessing).
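
The modeling and evaluation steps above can be sketched roughly as follows. This is a minimal illustration on simulated data, not the code in this repository: the feature matrix, the label, the train/test split, and the helper `auc_score` are all hypothetical, and it assumes the `glmnet` and `randomForest` packages are installed. The lasso penalty is chosen by 5-fold cross-validation on AUC; the random forest uses default settings, so any cross-validated tuning of it is left out here.

```r
# Minimal sketch on simulated data; every name below is hypothetical.
library(glmnet)        # lasso logistic regression
library(randomForest)  # random forest classifier

set.seed(42)
n <- 1000
p <- 10
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
colnames(x) <- paste0("feature_", seq_len(p))
# Simulated label: 1 = customer purchased again; only two features carry signal.
y <- rbinom(n, size = 1, prob = plogis(1.5 * x[, 1] - 1.0 * x[, 2]))

train <- sample(n, size = 0.7 * n)
test <- setdiff(seq_len(n), train)

# Rank-based AUC (Mann-Whitney U statistic), base R only.
auc_score <- function(score, label) {
  r <- rank(score)
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Lasso (alpha = 1) logistic regression; lambda chosen by 5-fold CV on AUC.
cv_fit <- cv.glmnet(x[train, ], y[train], family = "binomial",
                    alpha = 1, type.measure = "auc", nfolds = 5)
lasso_prob <- as.numeric(predict(cv_fit, newx = x[test, ],
                                 s = "lambda.min", type = "response"))

# Random forest; the response must be a factor for classification.
rf_fit <- randomForest(x = x[train, ], y = factor(y[train]), ntree = 200)
rf_prob <- predict(rf_fit, newdata = x[test, ], type = "prob")[, "1"]

# Baseline: random guessing, whose expected AUC is 0.5.
random_prob <- runif(length(test))

aucs <- c(lasso = auc_score(lasso_prob, y[test]),
          random_forest = auc_score(rf_prob, y[test]),
          random_guess = auc_score(random_prob, y[test]))
print(round(aucs, 3))
```

Scoring all three on the same held-out rows keeps the comparison with the random-guessing baseline like for like.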

How do the files work?

E-commence analysis.rmd is the main file. It covers data importing, cleaning and reprocessing, and model testing and optimization in R, with SQL run inside R through the 'sqldf' package.
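
As a rough illustration of running SQL inside R with 'sqldf', the sketch below queries two toy data frames for customers with exactly one order. The table names, column names, and values are assumptions made for this example; the real schema is not described here.

```r
# Minimal sketch; tables and columns below are hypothetical.
library(sqldf)

customer <- data.frame(customer_id = 1:4,
                       region = c("east", "west", "east", "north"))
orders <- data.frame(order_id = 1:6,
                     customer_id = c(1, 1, 2, 3, 3, 3),
                     amount = c(20, 35, 15, 50, 10, 5))

# sqldf looks up data frames by name and runs the query in SQLite by default.
# Count orders per customer and keep those with exactly one order.
one_time_buyers <- sqldf("
  SELECT c.customer_id, c.region,
         COUNT(o.order_id) AS n_orders,
         SUM(o.amount) AS total_amount
  FROM customer AS c
  JOIN orders AS o ON o.customer_id = c.customer_id
  GROUP BY c.customer_id, c.region
  HAVING COUNT(o.order_id) = 1
")
print(one_time_buyers)
```

The result is an ordinary data frame, so it can be passed straight to the cleaning and modeling steps in R.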

Other code: remove_outlier.R; EDA.R (some exploratory trials to better understand the dataset); Visualizations.R (my visualizations of model performance, plus visualization tutorial samples).

Other files: r.data loads subsets of the data consisting of 10,000 rows for the customer, product, and order tables.


Languages

Language:R 100.0%