dilayercelik / Arvato-MLProject

Udacity Machine Learning Engineer Nanodegree Capstone on customer segmentation and acquisition with Arvato Bertelsmann Financial Solutions.

Arvato-MLProject: PASSED (Sept 2020)

This GitHub repository hosts the Capstone project I developed and completed as part of the Udacity Machine Learning Engineer Nanodegree.

In this project, I worked on four demographic datasets provided by Arvato Financial Services. The intermediate goal was to extract similarities and differences between the general population and the current customer base of a Germany-based mail-order company. The final goal was to predict which individuals are most likely to become new customers and could therefore be targeted by the company's mail-order campaign.

This project employs both unsupervised (dimensionality reduction with PCA, customer segmentation with k-means clustering) and supervised (from scikit-learn...) machine learning algorithms and techniques.
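As a rough illustration of the unsupervised part, the sketch below standardises the features, reduces them with PCA, clusters the general population with k-means, and then compares how the population and the customers are distributed over the clusters. It is not the notebook's actual code: the data are synthetic stand-ins, and the names `population` and `customers`, the number of components, and the number of clusters are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for the real demographic tables (assumption:
# rows are individuals, columns are already-cleaned numeric features).
rng = np.random.default_rng(0)
population = rng.normal(size=(500, 10))
customers = rng.normal(loc=0.5, size=(200, 10))

N_COMPONENTS = 3  # hypothetical choice, not taken from the project
N_CLUSTERS = 4    # hypothetical choice, not taken from the project

# Fit scaling and PCA on the general population only.
scaler = StandardScaler().fit(population)
pca = PCA(n_components=N_COMPONENTS, random_state=0)
pop_reduced = pca.fit_transform(scaler.transform(population))

# Cluster the population, then assign customers to the same clusters.
kmeans = KMeans(n_clusters=N_CLUSTERS, n_init=10, random_state=0)
pop_clusters = kmeans.fit_predict(pop_reduced)
cust_clusters = kmeans.predict(pca.transform(scaler.transform(customers)))

# Share of each group that falls into each cluster.
pop_share = np.bincount(pop_clusters, minlength=N_CLUSTERS) / len(pop_clusters)
cust_share = np.bincount(cust_clusters, minlength=N_CLUSTERS) / len(cust_clusters)

for k in range(N_CLUSTERS):
    print(f"cluster {k}: population {pop_share[k]:.2%}, customers {cust_share[k]:.2%}")
```

Clusters in which customers are over-represented relative to the general population point to the segments most worth targeting.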

Proposal Review

You can access the mentor review I received for my Proposal submission (see folder) here.

Alternatively, you can access a pdf version in this repo, here.

Project Notebook Review (23/09/2020)

You can access the mentor review I received for my Project submission here.

Alternatively, you can access a pdf version in this repo, here.

Requirements

The Jupyter Notebook is written in Python (version 3.x required).

This project requires the libraries listed in the requirement.txt file and the Anaconda distribution of Python 3.6.

The main packages used are:

numpy: scientific computing tools

pandas: data structures and data analysis tools

matplotlib: data visualisation tools

seaborn: data visualisation tools

scikit-learn (sklearn): Machine Learning library in Python

Results

You can view the leaderboard of the Kaggle competition. As of writing, my first and only submission (the kaggle.csv file) ranks 174th out of 270 participants.

The ROC AUC (area under the ROC curve) score I obtained with this submission is 0.74612.
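To make the metric concrete: the ROC AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition on a tiny made-up example. The function name `roc_auc` and the sample labels and scores are illustrative assumptions, not the project's data or code.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = responded to the campaign, 0 = did not.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc = roc_auc(labels, scores)
print(auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of examples, so it is only suitable for illustration. For real data, `sklearn.metrics.roc_auc_score` computes the same quantity efficiently.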

Acknowledgements

I would like to thank everyone involved in providing this Capstone Project, both at Udacity (for their support) and at Arvato Financial Services (for giving us access to their private data).

Have a look at the Machine Learning Engineer Nanodegree, offered by Udacity (via the School of AI), here.

The syllabus for this online programme can be found here.

Author
