xuqiang / hadoop-naive-bayes

Project on Apache Hadoop

Naive Bayes on Hadoop

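This project uses the "Census Income" dataset (also known as "Adult") from the UCI Machine Learning Repository. Each record describes a person, and the task is to predict whether that person's income exceeds $50K per year.

The dataset can be downloaded from the UCI repository (specifically, the file adult.data).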

Project Description

For this project we implement a Naive Bayes classifier on Hadoop and test it on the "Census Income" dataset.
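As a rough illustration of the idea, the sketch below shows the counting and scoring behind a Naive Bayes classifier in plain Java, without any Hadoop dependency. It is not the code in this repository: the class name `NaiveBayesSketch`, the `label|index=value` key layout, the Laplace smoothing, and the toy records are all assumptions made for the example. In a MapReduce setting, the `train` method corresponds roughly to what a mapper would emit per record, and the `merge` calls stand in for the summing done by a reducer.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class NaiveBayesSketch {
    // Counts per class label, and per "label|attributeIndex=value" key (hypothetical layout).
    private final Map<String, Integer> classCounts = new HashMap<>();
    private final Map<String, Integer> featureCounts = new HashMap<>();
    // Distinct values seen for each attribute, needed for Laplace smoothing.
    private final Map<Integer, Set<String>> valuesPerAttribute = new HashMap<>();
    private int total = 0;

    // Counts one comma-separated record; the last field is taken as the class label.
    public void train(String line) {
        String[] fields = line.split(",\\s*");
        String label = fields[fields.length - 1];
        classCounts.merge(label, 1, Integer::sum);
        total++;
        for (int i = 0; i < fields.length - 1; i++) {
            featureCounts.merge(label + "|" + i + "=" + fields[i], 1, Integer::sum);
            valuesPerAttribute.computeIfAbsent(i, k -> new HashSet<>()).add(fields[i]);
        }
    }

    // Returns the label with the highest log-posterior score (Laplace-smoothed).
    public String classify(String[] attributes) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Integer> entry : classCounts.entrySet()) {
            String label = entry.getKey();
            int classCount = entry.getValue();
            double score = Math.log((double) classCount / total);
            for (int i = 0; i < attributes.length; i++) {
                int count = featureCounts.getOrDefault(label + "|" + i + "=" + attributes[i], 0);
                int distinct = valuesPerAttribute.getOrDefault(i, new HashSet<>()).size();
                score += Math.log((count + 1.0) / (classCount + distinct));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        // Toy records made up for illustration (education, discretized hours, label).
        nb.train("Bachelors, high, >50K");
        nb.train("Bachelors, high, >50K");
        nb.train("HS-grad, low, <=50K");
        nb.train("HS-grad, low, <=50K");
        nb.train("HS-grad, high, <=50K");
        System.out.println(nb.classify(new String[] {"Bachelors", "high"}));
        System.out.println(nb.classify(new String[] {"HS-grad", "low"}));
    }
}
```

Scores are summed as logarithms rather than multiplied as probabilities, which avoids numeric underflow when a record has many attributes.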

Preprocessing

We discretize the numerical attributes with a MapReduce job on Hadoop.
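The README does not say which discretization scheme is used, so the following is only a minimal sketch of one common option, equal-width binning, written in plain Java without Hadoop. The class name `EqualWidthDiscretizer`, the bin count, and the sample ages are assumptions for illustration. On Hadoop, a first job could compute the minimum and maximum of each attribute, and a second job could then map every value to its bin index as `binOf` does here.

```java
import java.util.Arrays;

public class EqualWidthDiscretizer {
    private final double min;
    private final double max;
    private final int bins;

    // Learns the value range from the data; bins is the number of intervals (assumed parameter).
    public EqualWidthDiscretizer(double[] values, int bins) {
        if (values.length == 0 || bins < 1) {
            throw new IllegalArgumentException("need at least one value and one bin");
        }
        this.min = Arrays.stream(values).min().getAsDouble();
        this.max = Arrays.stream(values).max().getAsDouble();
        this.bins = bins;
    }

    // Maps a numeric value to a bin index in [0, bins - 1].
    public int binOf(double value) {
        if (max == min) {
            return 0; // all training values were identical, so there is a single bin
        }
        int bin = (int) ((value - min) / (max - min) * bins);
        // Clamp so that the maximum and out-of-range values fall into the edge bins.
        return Math.max(0, Math.min(bins - 1, bin));
    }

    public static void main(String[] args) {
        // Made-up ages, not taken from adult.data.
        double[] ages = {20, 30, 40, 50, 60};
        EqualWidthDiscretizer d = new EqualWidthDiscretizer(ages, 4);
        for (double age : ages) {
            System.out.println(age + " -> bin " + d.binOf(age));
        }
    }
}
```

After this step each numerical attribute becomes a small set of categories, so it can be counted in the same way as the categorical attributes.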

License:Apache License 2.0


Languages

Language:Java 100.0%