This is a data analytics project where the task is customer segmentation. I collected the data from the first technical assessment at TidyQuant. I gave it my best and applied many approaches to the task. Unfortunately, I failed the second technical assessment :( So I decided to sharpen my analytical skills, and this project is the result.
- As mentioned above, I collected the data through the assessment.
- The data was very raw: it contained only numbers and no labels (unlabeled data, so an unsupervised setting).
- Used Pandas to inspect properties of the data such as null values, the mean of total orders, revenue, etc.
- Analyzed the patterns/trends present in the dataset through univariate and bivariate analysis.
- Identified customer types using the Recency, Frequency and Monetary (RFM) model. I read a lot about the RFM technique on investopedia.com and used GeeksforGeeks as a reference.
- Used Matplotlib and Seaborn to explore revenue, orders, and monthly revenue/orders as part of the Exploratory Data Analysis.
- Applied a data storytelling methodology after every insight (you can find this in the .ipynb file uploaded above).
- Cleaned the data and applied a feature selection technique (heatmap) for better prediction.
- To build a machine learning pipeline that predicts the customer type from the features, I scaled the data and trained on multi-class classification data (labels based mainly on the RFM model: [Champions, Potential Customers, Need Attention]).
- Handled class imbalance in the data using an over-sampling technique (SMOTE).
- For better performance, I trained the models using cross-validation and hyperparameter tuning.
- As this is a multi-class problem, I used KNN (93%) and Naive Bayes (87%), and evaluated them using a confusion matrix and a classification report.
- Built an insightful Tableau dashboard based on the cleaned and well-structured data.
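
To make the RFM step more concrete, here is a minimal sketch of how Recency, Frequency and Monetary values can be computed with Pandas and turned into segments. It is not the exact code from the notebook: the toy data, the column names (`customer_id`, `order_date`, `revenue`), the 1 to 3 scoring, and the segment thresholds are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical toy orders; the column names are assumptions, not the assessment's schema.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "order_date": pd.to_datetime([
        "2023-01-05", "2023-03-01", "2022-11-20",
        "2023-02-10", "2023-02-25", "2023-03-10",
    ]),
    "revenue": [50.0, 70.0, 20.0, 200.0, 150.0, 300.0],
})

# Reference date: one day after the most recent order in the data.
snapshot = orders["order_date"].max() + pd.Timedelta(days=1)

# One row per customer: days since last order, number of orders, total revenue.
rfm = orders.groupby("customer_id").agg(
    recency=("order_date", lambda d: (snapshot - d.max()).days),
    frequency=("order_date", "count"),
    monetary=("revenue", "sum"),
)

def score(series, labels):
    """Split customers into three equal-sized groups by rank and label them."""
    ranks = series.rank(method="first")  # ranking first avoids duplicate bin edges
    return pd.qcut(ranks, 3, labels=labels).astype(int)

# A score of 3 is best. Low recency is good, so its labels are reversed.
rfm["r_score"] = score(rfm["recency"], [3, 2, 1])
rfm["f_score"] = score(rfm["frequency"], [1, 2, 3])
rfm["m_score"] = score(rfm["monetary"], [1, 2, 3])
rfm["rfm_total"] = rfm[["r_score", "f_score", "m_score"]].sum(axis=1)

def to_segment(total):
    # Assumed thresholds, chosen only for this example.
    if total >= 8:
        return "Champions"
    if total >= 5:
        return "Potential Customers"
    return "Need Attention"

rfm["segment"] = rfm["rfm_total"].apply(to_segment)
print(rfm)
```

The resulting `segment` column is the kind of label that can then be used as the target for the classification step.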
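
The modelling steps (scaling, cross-validation, hyperparameter tuning, KNN, and evaluation) could look roughly like the sketch below, using scikit-learn. This is an illustration under assumptions, not the notebook's code: the data is synthetic (generated with `make_classification`), the parameter grid is made up, and the SMOTE over-sampling step is left out to keep the example free of extra dependencies.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the three RFM features and three segments (assumption).
X, y = make_classification(
    n_samples=300, n_features=3, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
segment_names = ["Champions", "Potential Customers", "Need Attention"]

# Hold out a stratified test set so every segment appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so it is re-fit on each cross-validation fold.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# 5-fold cross-validated search over an assumed grid of neighbour counts.
search = GridSearchCV(pipe, {"knn__n_neighbors": [3, 5, 7, 9]}, cv=5)
search.fit(X_train, y_train)

# Evaluate the best model on the held-out data.
y_pred = search.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
report = classification_report(y_test, y_pred, target_names=segment_names)

print("Best parameters:", search.best_params_)
print(cm)
print(report)
```

Keeping the scaler inside the `Pipeline` means the test folds never influence the scaling statistics, which gives a more honest cross-validation score. The same pattern works for Naive Bayes by swapping the final step for `GaussianNB` and adjusting the grid.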