I chose this dataset because imbalanced data is really challenging to work with. I also wanted to boost my knowledge and skills as a data scientist and push my computer hardware to its limits.
In real life, imbalanced datasets show up in many scenarios, for example fraud detection, cancer detection, manufacturing defects, and online ad conversion.

Table of contents
- Problem Statement and Hypothesis Generation
- Data Exploration
- Data Cleaning
- Missing Value Imputation
- Data Manipulation & Feature Engineering
- Machine learning
  - Imbalanced Techniques
    - Oversampling Techniques
    - Undersampling Techniques
    - SMOTE
  - Naive Bayes
  - XGBoost
    - Homework: Top 20 features
    - AUC Threshold
  - SVM
    - Homework: Class weights