Ocanamat / courseProject_GetCleanData

A repo containing an R script for performing the required analysis of the course project

Home Page: https://class.coursera.org/getdata-032/human_grading/view/courses/975116/assessments/3/submissions

Getting and Cleaning Data: Course Project

This repository contains the script run_analysis.R, as required for the "Getting and Cleaning Data" Course Project, part of the Data Science Specialization.

Synopsis

run_analysis.R creates a series of data frames and tbls in order to:

1. Create training and test data sets with the feature vectors from the raw data.
2. Add the subject IDs and activity names to the training and test data sets.
3. Merge the training and test data sets into one data set.
4. Clean up the column names so that they are valid for dplyr manipulation.
5. Extract only the mean and standard deviation for each measurement.
6. Rename the activity names to descriptive (readable) names for the activities in the data set.
7. Finally, create "tidyData_HumanActivity", an independent tidy data set with the average of each
	variable for each activity and each subject.
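
The steps above can be sketched on a toy example. This is not the code in run_analysis.R: it uses base R only (the script itself relies on dplyr, tidyr and plyr), and the feature names, activity labels and numbers below are made-up stand-ins for the raw files. The helper make_set and the regular expression used to pick the mean and standard deviation columns are assumptions made for illustration.

```r
# Toy stand-ins for the raw files (hypothetical values, not real measurements).
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X")
activity_labels <- data.frame(code = 1:2, name = c("WALKING", "LAYING"),
                              stringsAsFactors = FALSE)

# Steps 1-2: feature vectors plus subject IDs and activity codes.
make_set <- function(x, subject, activity) {
  df <- as.data.frame(x)
  names(df) <- features
  data.frame(subject = subject, activity = activity, df, check.names = FALSE)
}

train <- make_set(matrix(c(1, 3, 5, 7,  2, 4, 6, 8,  9, 9, 9, 9), ncol = 3),
                  subject = c(1, 1, 2, 2), activity = c(1, 1, 2, 2))
test  <- make_set(matrix(c(10, 20,  30, 40,  0, 0), ncol = 3),
                  subject = c(3, 3), activity = c(1, 1))

# Step 3: stack the training and test rows into one data set.
merged <- rbind(train, test)

# Step 4: turn names such as "tBodyAcc-mean()-X" into syntactically valid ones.
names(merged) <- make.names(names(merged), unique = TRUE)

# Step 5: keep the ID columns and the mean/std measurements only.
keep <- grepl("\\.(mean|std)\\.", names(merged))
selected <- merged[, c("subject", "activity", names(merged)[keep])]

# Step 6: replace activity codes with readable names.
selected$activity <- activity_labels$name[match(selected$activity,
                                                activity_labels$code)]

# Step 7: average every remaining variable per subject and activity.
tidyData_HumanActivity <- aggregate(. ~ subject + activity,
                                    data = selected, FUN = mean)
```

In this sketch the result has one row per subject and activity pair and one column per retained measurement, which is the shape described in step 7.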

Requirements

A few assumptions are made by this script:

1. A folder named "UCI HAR Dataset" exists in the working directory from which the script
is run. This folder must contain the unzipped raw data downloaded from
https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip

2. The packages "dplyr", "tidyr" and "plyr" are installed on the host machine.

3. The working directory is set before the script is run.
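
These assumptions can be checked before running the script. The helper below, check_requirements, is a hypothetical name and not part of run_analysis.R. It is a minimal sketch that reports whether the data folder is present in the current working directory and which of the listed packages are not installed.

```r
# Hypothetical helper: reports which of the README's assumptions do not hold.
check_requirements <- function(data_dir = "UCI HAR Dataset",
                               packages = c("dplyr", "tidyr", "plyr")) {
  # requireNamespace() tests for an installed package without attaching it.
  installed <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
  list(
    data_dir_found   = dir.exists(file.path(getwd(), data_dir)),
    missing_packages = packages[!installed]
  )
}

status <- check_requirements()
if (!status$data_dir_found) {
  message("Folder 'UCI HAR Dataset' not found in ", getwd())
}
if (length(status$missing_packages) > 0) {
  message("Missing packages: ", paste(status$missing_packages, collapse = ", "))
}
```

If a message appears, set the working directory with setwd() or install the missing packages with install.packages() before sourcing the script.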

Code Example

    library(dplyr)
    library(tidyr)
    library(plyr)
    source("run_analysis.R")
