geeky-bit / Census-Data-Imbalanced-Data-Problem

This is a simple imbalanced-dataset handling problem in which I have used census data.

Handling Imbalanced-Data-Problem using Census Data

The data used in this example is imbalanced, fairly large, and high-dimensional. The purpose of this example is to show how to handle imbalanced datasets. The approach is fairly simple and is only one of many possible.

In this project, the following tasks are performed:

  • Data Exploration
  • Data Cleaning
  • Feature Engineering

Techniques used:

  • Oversampling
  • Undersampling
  • SMOTE
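
As a rough illustration of the first two techniques, here is a minimal base R sketch of random oversampling and undersampling. It is not the code from this repository: the toy data frame, the column names `age` and `income`, and the helper functions `oversample()` and `undersample()` are all assumptions made up for the example. SMOTE, which synthesizes new minority rows by interpolating between neighbours, is not shown here.

```r
# Toy stand-in for the census data (hypothetical columns):
# 95 majority rows ("low") and 5 minority rows ("high").
set.seed(42)
toy <- data.frame(
  age    = c(rnorm(95, mean = 35, sd = 10), rnorm(5, mean = 50, sd = 8)),
  income = factor(c(rep("low", 95), rep("high", 5)))
)

# Random oversampling: keep every original row and add minority rows,
# drawn with replacement, until each class matches the largest class.
oversample <- function(df, target) {
  groups <- split(df, df[[target]])
  n_max  <- max(sapply(groups, nrow))
  parts  <- lapply(groups, function(g) {
    extra <- sample.int(nrow(g), n_max - nrow(g), replace = TRUE)
    g[c(seq_len(nrow(g)), extra), , drop = FALSE]
  })
  do.call(rbind, parts)
}

# Random undersampling: draw rows without replacement from each class
# until each class matches the smallest class.
undersample <- function(df, target) {
  groups <- split(df, df[[target]])
  n_min  <- min(sapply(groups, nrow))
  parts  <- lapply(groups, function(g) {
    g[sample.int(nrow(g), n_min), , drop = FALSE]
  })
  do.call(rbind, parts)
}

over  <- oversample(toy, "income")
under <- undersample(toy, "income")

table(toy$income)    # high = 5,  low = 95
table(over$income)   # high = 95, low = 95
table(under$income)  # high = 5,  low = 5
```

Resampling like this would normally be applied to the training split only, so that the test set keeps the original class distribution.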

ML algorithms:

  • Naive Bayes
  • XGBoost

Download dataset: http://archive.ics.uci.edu/ml/machine-learning-databases/census-income-mld/


Languages

Language:R 100.0%