rocksailor / Exploratory-Data-Analysis-and-Prediction-on-Diabetes-Dataset-using-R

This project first conducts Exploratory Data Analysis (EDA) and data visualization on the diabetes dataset and then predicts diabetes using machine learning.

Dataset

The diabetes data can be downloaded from:

http://biostat.mc.vanderbilt.edu/wiki/Main/DataSets?CGISESSID=10713f6d891653ddcbb7ddbdd9cffb79

Exploratory Data Analysis (EDA)

  1. Descriptive statistics

Attribute type, class distribution, mean, standard deviation, median, quartiles, skewness, correlation

  2. Data visualization

Histogram plot

Density plot

Box and Whisker plot

Bar plot

Missing data map

Pair-wise correlation plot
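
The EDA steps listed above can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the data frame is synthetic, the column names (age, glucose, bmi, gender, outcome) are hypothetical stand-ins rather than the columns of the actual dataset, and the repository's own scripts may rely on add-on packages instead of the base functions used here. The plots are written to a PDF in the session's temporary directory.

```r
# Synthetic stand-in data; column names are hypothetical, not the real dataset's.
set.seed(42)
n <- 200
diabetes <- data.frame(
  age     = round(runif(n, 20, 80)),
  glucose = rnorm(n, mean = 110, sd = 30),
  bmi     = rnorm(n, mean = 28, sd = 5),
  gender  = factor(sample(c("female", "male"), n, replace = TRUE))
)
diabetes$outcome <- factor(
  ifelse(diabetes$glucose + rnorm(n, sd = 15) > 125, "pos", "neg"),
  levels = c("neg", "pos")
)
diabetes$bmi[sample(n, 10)] <- NA   # inject some missing values

# Moment-based sample skewness (base R has no built-in skewness function).
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# 1. Descriptive statistics
str(diabetes)                    # attribute types
print(table(diabetes$outcome))   # class distribution
num <- diabetes[sapply(diabetes, is.numeric)]
stats <- sapply(num, function(x) c(
  mean     = mean(x, na.rm = TRUE),
  sd       = sd(x, na.rm = TRUE),
  median   = median(x, na.rm = TRUE),
  quantile(x, c(0.25, 0.75), na.rm = TRUE),
  skewness = skewness(x)
))
print(round(stats, 3))
cor_mat <- cor(num, use = "pairwise.complete.obs")
print(round(cor_mat, 3))

# 2. Data visualization
plot_file <- file.path(tempdir(), "eda_plots.pdf")
pdf(plot_file)
hist(diabetes$glucose, main = "Histogram of glucose", xlab = "glucose")
plot(density(diabetes$glucose), main = "Density of glucose")
boxplot(glucose ~ outcome, data = diabetes, main = "Glucose by outcome")
barplot(table(diabetes$outcome), main = "Class distribution")
image(t(is.na(diabetes)) * 1, axes = FALSE, main = "Missing data map")
pairs(num, main = "Pair-wise scatter plots")
invisible(dev.off())
```

The missing data map here is a plain `image()` of the `is.na()` matrix, and `pairs()` stands in for a dedicated correlation plot. Both are base R substitutes chosen to keep the sketch free of extra dependencies.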

Prediction on Diabetes

We compare the performance of the following classifiers:

  1. Logistic Regression

  2. Support Vector Machine (SVM)

  3. Random Forest
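
A comparison of the three classifiers listed above might look like the following minimal sketch. It rests on several assumptions not stated in this README: the data is synthetic with hypothetical column names, the split is a single 70/30 hold-out, accuracy is the only metric, and the SVM and random forest models come from the `e1071` and `randomForest` packages (each is skipped if not installed). The repository's actual evaluation procedure may differ.

```r
# Synthetic stand-in data; column names are hypothetical.
set.seed(1)
n <- 300
d <- data.frame(
  age     = round(runif(n, 20, 80)),
  glucose = rnorm(n, mean = 110, sd = 30),
  bmi     = rnorm(n, mean = 28, sd = 5)
)
score <- 0.04 * (d$glucose - 110) + 0.05 * (d$bmi - 28) + rnorm(n)
d$outcome <- factor(ifelse(score > 0, "pos", "neg"), levels = c("neg", "pos"))

# Single 70/30 train/test split (an assumption; cross-validation is also common).
idx   <- sample(n, round(0.7 * n))
train <- d[idx, ]
test  <- d[-idx, ]

accuracy <- function(pred, truth) mean(pred == truth)

# 1. Logistic regression (base R)
fit_glm  <- glm(outcome ~ ., data = train, family = binomial)
prob_pos <- predict(fit_glm, newdata = test, type = "response")
pred_glm <- factor(ifelse(prob_pos > 0.5, "pos", "neg"), levels = levels(d$outcome))
results  <- c(logistic = accuracy(pred_glm, test$outcome))

# 2. Support Vector Machine (needs the e1071 package)
if (requireNamespace("e1071", quietly = TRUE)) {
  fit_svm <- e1071::svm(outcome ~ ., data = train)
  results["svm"] <- accuracy(predict(fit_svm, newdata = test), test$outcome)
}

# 3. Random Forest (needs the randomForest package)
if (requireNamespace("randomForest", quietly = TRUE)) {
  fit_rf <- randomForest::randomForest(outcome ~ ., data = train)
  results["random_forest"] <- accuracy(predict(fit_rf, newdata = test), test$outcome)
}

print(round(results, 3))
```

Because `outcome` is a factor with levels `neg` and `pos`, `glm` models the probability of `pos`, which is why predictions above 0.5 are mapped to that class.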

Languages

Language: R 100.0%