muj1og / Imbalanced-Techniques-project

Imbalanced-Data

I chose this dataset because imbalanced data is genuinely challenging to work with, and because the project would build my knowledge and skills as a data scientist while pushing my computer hardware to its limits.

In real life, imbalanced datasets appear in many scenarios, for example fraud detection, cancer detection, manufacturing defects, and online ad conversion.

Table of contents

  1. Problem Statement and Hypothesis Generation
  2. Data Exploration
  3. Data Cleaning
  4. Missing Value Imputation
  5. Data Manipulation & Feature Engineering
  6. Machine Learning

Imbalanced Techniques

  1. Oversampling Techniques
  2. Undersampling Techniques
  3. SMOTE
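
The three techniques listed above can be illustrated with a minimal, standard-library-only sketch. This is not the code used in this project (the notebooks are not reproduced here); the function names `random_oversample`, `random_undersample`, and `smote_like`, the toy data, and the parameter `k` are all hypothetical. The SMOTE-style function follows the general idea of interpolating between a minority sample and one of its nearest minority neighbours, and it assumes the minority class has at least two rows.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate rows (sampled with replacement) until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lbl in zip(X, y) if lbl == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        rows = [x for x, lbl in zip(X, y) if lbl == label]
        for x in rng.sample(rows, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


def smote_like(X, y, minority, k=2, seed=0):
    """Add synthetic minority rows by interpolating towards a near minority neighbour."""
    rng = random.Random(seed)
    minority_rows = [x for x, lbl in zip(X, y) if lbl == minority]
    n_new = max(Counter(y).values()) - len(minority_rows)
    X_out, y_out = list(X), list(y)
    for _ in range(n_new):
        base = rng.choice(minority_rows)
        others = [r for r in minority_rows if r is not base]
        # Up to k nearest minority neighbours by squared Euclidean distance.
        neighbours = sorted(
            others, key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r))
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
        y_out.append(minority)
    return X_out, y_out


# Toy data (hypothetical): 8 majority rows (class 0), 2 minority rows (class 1).
X = [[float(i), float(i % 3)] for i in range(10)]
y = [0] * 8 + [1] * 2

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
X_sm, y_sm = smote_like(X, y, minority=1)

print(Counter(y_over), Counter(y_under), Counter(y_sm))
```

Oversampling and SMOTE grow the minority class to the size of the majority class, while undersampling shrinks the majority class instead. Resampling is normally applied to the training split only, so that the evaluation data keeps its original class ratio.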

Naive Bayes

XGBoost

  1. Homework-Top 20 features
  2. AUC Threshold

SVM

  1. Homework-Class weights
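
As a rough illustration of the class-weights idea, the sketch below computes "balanced" weights as `n_samples / (n_classes * class_count)`, the heuristic that scikit-learn documents for `class_weight="balanced"`. The helper name `balanced_class_weights` and the 90/10 toy labels are hypothetical, and this is not necessarily how the homework in this repository was solved.

```python
from collections import Counter


def balanced_class_weights(y):
    """Weight each class inversely to its frequency: n_samples / (n_classes * count)."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * n) for label, n in counts.items()}


# Toy labels (hypothetical): 90 majority samples, 10 minority samples.
y_labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(y_labels)
print(weights)
```

The rarer class receives the larger weight, so misclassifying a minority sample costs more during training. With scikit-learn, a dictionary like this can be passed as the `class_weight` argument of `sklearn.svm.SVC`.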

License: MIT License