getdata

Assignment:
You should create one R script called run_analysis.R that does the following.

Merges the training and the test sets to create one data set.
Extracts only the measurements on the mean and standard deviation for each measurement.
Uses descriptive activity names to name the activities in the data set.
Appropriately labels the data set with descriptive variable names.
Creates a second, independent tidy data set with the average of each variable for each activity and each subject.

This README file explains:

run_analysis.R

getData1.R

Creates the required tidy data set from the raw data.
Requires the raw data, with unaltered files and unaltered file structure, to be in the same directory as the script.

For training and test sets separately, convert y_train/y_test from activity label code to activity label text.
For training and test sets separately, extract the mean and std columns from the X_train/X_test files.
For training and test sets separately, join the columns of subject_train/subject_test (subject numbers), y_train/y_test (activity labels), and X_train/X_test (feature data).
Join rows of training and test dataframes to create the final tidy dataset.
Write the final tidy dataset to output files called output1.csv and output1.txt (two file formats of same data).
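The steps above can be sketched in a few lines of R. This is a minimal illustration on toy data, not the actual contents of getData1.R: the helper `build_block`, the toy values, and the `mean()`/`std()` name pattern are assumptions made for the example, and the output is written to a temporary directory rather than next to the script.

```r
# Toy stand-ins for activity_labels.txt and features.txt
# (hypothetical values, much smaller than the real files).
activity_labels <- data.frame(code = 1:3,
                              label = c("WALKING", "SITTING", "LAYING"),
                              stringsAsFactors = FALSE)
features <- data.frame(index = 1:4,
                       name = c("tBodyAcc-mean()-X", "tBodyAcc-std()-X",
                                "tBodyAcc-mad()-X", "tBodyAcc-max()-X"),
                       stringsAsFactors = FALSE)

# Hypothetical helper: build one block from the subject, y and X tables.
build_block <- function(subject, y, X) {
  # Convert activity codes to label text; match() keeps the row order.
  activity <- activity_labels$label[match(y[[1]], activity_labels$code)]
  # Keep only feature columns whose name contains mean() or std()
  # (assumed pattern; the script may select columns differently).
  keep <- grep("mean\\(\\)|std\\(\\)", features$name)
  X <- X[, keep, drop = FALSE]
  names(X) <- features$name[keep]
  # Join subject, activity and feature columns side by side.
  data.frame(subject = subject[[1]], activity = activity, X,
             check.names = FALSE, stringsAsFactors = FALSE)
}

train <- build_block(subject = data.frame(V1 = c(1, 1)),
                     y = data.frame(V1 = c(1, 3)),
                     X = data.frame(V1 = c(0.1, 0.2), V2 = c(0.3, 0.4),
                                    V3 = c(0.5, 0.6), V4 = c(0.7, 0.8)))
test <- build_block(subject = data.frame(V1 = 2),
                    y = data.frame(V1 = 2),
                    X = data.frame(V1 = 0.9, V2 = 1.0, V3 = 1.1, V4 = 1.2))

# Stack the training and test rows into one data frame.
tidydf <- rbind(train, test)

# Write the same data in two formats (temporary paths for this sketch).
out_csv <- file.path(tempdir(), "output1.csv")
out_txt <- file.path(tempdir(), "output1.txt")
write.csv(tidydf, out_csv, row.names = FALSE)
write.table(tidydf, out_txt, row.names = FALSE)
```

Converting the activity codes before joining, and using `match()` rather than `merge()`, keeps the rows of y aligned with the rows of subject and X, which matters because the three files are linked only by row position.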

Input files for getData1.R

'features.txt': List of all features. dimensions: 561 x 2
'activity_labels.txt': Links the class labels with their activity name. dimensions: 6 x 2
'train/X_train.txt': Training set. dimensions: 7352 x 561 (fixed width and space delimited)
'train/y_train.txt': Training labels (coded activity labels). dimensions: 7352 x 1
'train/subject_train.txt': Each row identifies the subject who performed the activity for each window sample. Its range is from 1 to 30. dimensions: 7352 x 1
'test/X_test.txt': Test set. dimensions: 2947 x 561 (fixed width and space delimited)
'test/y_test.txt': Test labels (coded activity labels). dimensions: 2947 x 1
'test/subject_test.txt': Each row identifies the subject who performed the activity for each window sample. Its range is from 1 to 30. dimensions: 2947 x 1
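As an illustration of how files laid out like this can be read and checked, the sketch below builds a miniature copy of the directory structure in a temporary folder and loads it with `read.table`. The folder name `raw_sketch` and all values are hypothetical; with the real data the same checks would correspond to the dimensions listed above (for example 7352 x 561 for X_train).

```r
# Build a miniature copy of the expected layout in a temp directory
# (hypothetical values; the real files are far larger).
root <- file.path(tempdir(), "raw_sketch")
dir.create(file.path(root, "train"), recursive = TRUE, showWarnings = FALSE)
writeLines(c("1 tBodyAcc-mean()-X", "2 tBodyAcc-std()-X"),
           file.path(root, "features.txt"))
writeLines(c("  2.5e-001 -1.0e-002", " -3.0e-001  4.0e-002"),
           file.path(root, "train", "X_train.txt"))
writeLines(c("5", "4"), file.path(root, "train", "y_train.txt"))
writeLines(c("1", "3"), file.path(root, "train", "subject_train.txt"))

# read.table splits on any run of whitespace by default, which suits
# the space-padded X files.
features <- read.table(file.path(root, "features.txt"),
                       stringsAsFactors = FALSE)
X_train <- read.table(file.path(root, "train", "X_train.txt"))
y_train <- read.table(file.path(root, "train", "y_train.txt"))
subject_train <- read.table(file.path(root, "train", "subject_train.txt"))

# Sanity checks: one feature name per X column, and one label and one
# subject id per X row.
stopifnot(ncol(X_train) == nrow(features),
          nrow(X_train) == nrow(y_train),
          nrow(X_train) == nrow(subject_train))
```

Checking the dimensions immediately after reading catches a moved or truncated file early, before any columns are joined by row position.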

getData2.R

Computes the mean of each variable in tidydf for each activity and each subject.
Writes those means to output files called output2.csv and output2.txt (two file formats of same data).
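A minimal sketch of this grouping step, assuming a data frame shaped like tidydf with `subject` and `activity` columns followed by numeric feature columns. The column names `feat_mean` and `feat_std` and the values are made up for the example, base R `aggregate` is one possible way to do it and may not be what getData2.R uses, and the files are written to a temporary directory.

```r
# Toy stand-in for tidydf (hypothetical values).
tidydf <- data.frame(
  subject   = c(1, 1, 1, 2),
  activity  = c("WALKING", "WALKING", "SITTING", "WALKING"),
  feat_mean = c(0.2, 0.4, 0.5, 0.8),
  feat_std  = c(1.0, 3.0, 2.0, 4.0),
  stringsAsFactors = FALSE
)

# Average every feature column within each subject/activity pair.
feature_cols <- !(names(tidydf) %in% c("subject", "activity"))
means <- aggregate(tidydf[, feature_cols, drop = FALSE],
                   by = list(subject = tidydf$subject,
                             activity = tidydf$activity),
                   FUN = mean)

# Write the same data in two formats (temporary paths for this sketch).
out2_csv <- file.path(tempdir(), "output2.csv")
out2_txt <- file.path(tempdir(), "output2.txt")
write.csv(means, out2_csv, row.names = FALSE)
write.table(means, out2_txt, row.names = FALSE)
```

The result has one row per subject/activity combination that actually occurs in the data, so the toy input above yields three rows.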
