llcorrea / income_census_prediction

Data science project for feature engineering and classification using as case study the Census Income dataset

Repository on GitHub: https://github.com/llcorrea/income_census_prediction

Income Prediction based on US Census Data

Data science project covering feature engineering and classification tasks.

Given the Census Income dataset, the goal is to carry out feature engineering on the public census data and then apply machine learning (ML) algorithms to classify it.

Dataset Information:

Census Income Dataset: http://archive.ics.uci.edu/ml/datasets/Census+Income

Extraction was done by Barry Becker from the 1994 US Census database. A set of reasonably clean records was extracted using the following conditions: (AAGE>16) and (AGI>100) and (AFNLWGT>1) and (HRSWK>0).
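The four extraction conditions above can be read as a simple record filter. The sketch below is only an illustration of that logic, not code from this repository: it assumes each raw record is a dict keyed by the variable names used in the conditions (AAGE, AGI, AFNLWGT, HRSWK), and the sample records are made up.

```python
def is_clean_record(record):
    """Return True if a record passes all four extraction conditions."""
    return (
        record["AAGE"] > 16
        and record["AGI"] > 100
        and record["AFNLWGT"] > 1
        and record["HRSWK"] > 0
    )


# Made-up records for illustration only; they are not taken from the dataset.
sample_records = [
    {"AAGE": 39, "AGI": 40000, "AFNLWGT": 77516, "HRSWK": 40},
    {"AAGE": 15, "AGI": 5000, "AFNLWGT": 1200, "HRSWK": 10},    # fails AAGE > 16
    {"AAGE": 52, "AGI": 60000, "AFNLWGT": 209642, "HRSWK": 0},  # fails HRSWK > 0
]

# Keep only the records that satisfy every condition.
clean_records = [r for r in sample_records if is_clean_record(r)]
print(len(clean_records))
```

The published dataset has already been filtered this way, so a filter like this would only be needed when working from the raw census extract.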


Task description

The prediction task is to determine, based on census data, whether a person's income exceeds $50K a year. The dataset is also known as the "Adult" dataset.
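Because the task is binary, the income column is typically mapped to a 0/1 target before training a classifier. The following is a minimal sketch of that step, not the notebook's actual code: the helper name `encode_income` is hypothetical, and it assumes the label strings are `>50K` and `<=50K`, possibly with surrounding whitespace or a trailing period as seen in some copies of the data.

```python
def encode_income(label):
    """Map an income label to 1 (over $50K) or 0 (at most $50K)."""
    # Assumption: labels may carry leading spaces or a trailing period.
    cleaned = label.strip().rstrip(".")
    if cleaned == ">50K":
        return 1
    if cleaned == "<=50K":
        return 0
    raise ValueError(f"Unexpected income label: {label!r}")


# Illustrative label values only.
raw_labels = [" <=50K", " >50K", " >50K.", " <=50K."]
targets = [encode_income(label) for label in raw_labels]
print(targets)
```

Raising on unknown labels, rather than defaulting to 0, makes malformed rows visible early instead of silently skewing the class balance.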


Languages

Language: Jupyter Notebook 100.0%