RamEppala / imbalanceddatasetproject

Machine Learning Project on Imbalanced Data in R

imbalanceddataproject

Project name: Machine Learning Project on Imbalanced Data

Description: The data used in this project is imbalanced. In real life, some extremely critical situations produce imbalanced data sets, for example fraud detection, cancer detection, manufacturing defects, and online ad conversion.

Some characteristics of this project: The data is fairly large and high dimensional. This project demonstrates data analysis and machine learning skills.
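
To make the idea of imbalance concrete, here is a minimal sketch in base R that builds a synthetic binary target and inspects its class proportions. The labels, the 950/50 split, and the variable names are illustrative assumptions and are not taken from the project's data set.

```r
# Illustrative only: synthetic labels, not the project's data.
# Assume 950 majority ("negative") and 50 minority ("positive") cases.
target <- factor(c(rep("negative", 950), rep("positive", 50)),
                 levels = c("negative", "positive"))

# Absolute counts and proportions per class.
class_counts <- table(target)
class_props  <- prop.table(class_counts)
print(class_counts)
print(round(class_props, 3))

# Ratio of majority to minority cases (19 for this synthetic split).
imbalance_ratio <- max(class_counts) / min(class_counts)
print(imbalance_ratio)
```

With a split like this, a model that always predicts the majority class would be 95% accurate while never detecting a single minority case, which is why accuracy alone is a poor metric for such data.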

Table of Contents:

1. Problem Statement & Hypothesis Generation
2. Data Exploration
3. Data Cleaning
4. Missing Value Imputation
5. Data Manipulation (a.k.a. Feature Engineering)
6. Machine Learning
   - Imbalanced Techniques: Oversampling, Undersampling, SMOTE
   - Naive Bayes
   - XGBoost: Top 20 Features, AUC Threshold
   - SVM: Class Weights
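
As a rough illustration of two of the resampling techniques named in the contents, the sketch below performs random undersampling and random oversampling using base R only. The data frame `df`, its columns `x` and `y`, and the 950/50 class split are hypothetical and are not the project's data or code. SMOTE, which synthesizes new minority rows rather than duplicating existing ones, is not shown.

```r
# Illustrative only: synthetic data frame with hypothetical column names.
set.seed(1)
df <- data.frame(
  x = rnorm(1000),
  y = factor(c(rep("negative", 950), rep("positive", 50)))
)

minority <- df[df$y == "positive", ]
majority <- df[df$y == "negative", ]

# Random undersampling: keep every minority row and draw an
# equally sized sample (without replacement) from the majority.
under <- rbind(minority,
               majority[sample(nrow(majority), nrow(minority)), ])

# Random oversampling: keep every majority row and resample the
# minority with replacement until both classes are the same size.
over <- rbind(majority,
              minority[sample(nrow(minority), nrow(majority),
                              replace = TRUE), ])

print(table(under$y))  # 50 rows per class
print(table(over$y))   # 950 rows per class
```

Resampling of this kind is normally applied to the training split only, so that the evaluation data keeps the original class distribution.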

Installation:

1. Download R (it is open source).
2. Open RStudio and load the imbalancedproject folder.

Prerequisites: An Intel-compatible platform running Windows 2000, XP/2003/Vista/7/8/2012 Server/8.1/10.

At least 32 MB of RAM, a mouse, and enough disk space for recovered files, image files, etc.

Administrative privileges are required to install and run R-Studio utilities under Windows 2000/XP/2003/Vista/7/8/2012 Server/8.1/10.

A network connection for recovering data over a network.

Installing R under Windows:

The bin/windows directory of a CRAN site contains binaries for a base distribution and a large number of add-on packages from CRAN to run on 32- or 64-bit Windows (Windows 7 and later are tested; XP is known to fail some tests) on ‘ix86’ and ‘x86_64’ CPUs.

Your file system must allow long file names (as is likely except perhaps for some network-mounted systems). If it doesn’t also support conversion to short name equivalents (a.k.a. DOS 8.3 names), then R must be installed in a path that does not contain spaces.

Installation is via the installer R-3.4.3-win.exe. Just double-click on the icon and follow the instructions. When installing on a 64-bit version of Windows the options will include 32- or 64-bit versions of R (and the default is to install both). You can uninstall R from the Control Panel.

Note that you will be asked to choose a language for installation, and that choice applies to both installation and un-installation but not to running R itself.
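
Once the installer finishes, a quick check from the R console can confirm which version and architecture were installed. This is a small sketch using base R only; the values printed depend on the machine.

```r
# Version string of the running R installation.
print(R.version.string)

# CPU architecture this R build targets (for example "x86_64" on 64-bit builds).
print(R.version$arch)

# Pointer size in bytes: 8 on 64-bit R, 4 on 32-bit R.
print(.Machine$sizeof.pointer)
```

If both the 32-bit and 64-bit versions were installed, running this in each one shows which build a given shortcut launches.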