karlosagudo / DataScienceProject3

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

README

This is the project assignment for data course from cousera : "Getting and cleaning data"

Original data: https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip

IN order to work please extract the two folders train and test in this same folder.

Original description: http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones

The attached R script (run_analysis.R) performs the following to clean up the data:

  • Merges the training and test sets to create one data set, namely train/X_train.txt with test/X_test.txt, the result of which is a 10299x561 data frame, as in the original description ("Number of Instances: 10299" and "Number of Attributes: 561"), train/subject_train.txt with test/subject_test.txt.

  • Reads features.txt and extracts only the measurements on the mean and standard deviation for each measurement.

  • Reads activity_labels.txt and applies descriptive activity names to name the activities in the data set:

      walking
      
      walkingupstairs
      
      walkingdownstairs
      
      sitting
      
      standing
      
      laying
    
  • The script also appropriately labels the data set with descriptive names: all feature names (attributes) and activity names are converted to lower case, underscores and brackets () are removed.

      tbodyacc-mean-x 
      
      tbodyacc-mean-y 
      
      tbodyacc-mean-z 
      
      tbodyacc-std-x 
      
      tbodyacc-std-y 
      
      tbodyacc-std-z 
      
      tgravityacc-mean-x 
      
      tgravityacc-mean-y
    
  • The script creates a 2nd, independent tidy data set with the average of each measurement for each activity and each subject. The result is saved as tidy_Data_set.txt.

About


Languages

Language:R 100.0%