GitHub - Arun-Kaushik/Cleaning_Data: Peer Assignment on Getting and Cleaning Data

Getting and Cleaning Data Project

Download and unzip the data set.
Set the working directory to this folder in RStudio.
Run run_analysis.R in RStudio.

The data-loading function in run_analysis.R returns one data set by reading and merging all component files. The data set comprises the X values, Y values, and subject IDs. path_prefix is the path where the data files can be found. fname_suffix is the file name suffix used to build the complete file name.
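A minimal sketch of such a function is shown here in base R. The name read_component and the X_/y_/subject_ file naming are assumptions based on the UCI HAR Dataset layout, not a copy of the code in run_analysis.R.

```r
# Hypothetical helper: read_component is an illustrative name, and the file
# layout (<path_prefix>/X_<suffix>.txt, y_<suffix>.txt, subject_<suffix>.txt)
# is assumed from the UCI HAR Dataset convention.
read_component <- function(path_prefix, fname_suffix) {
  # Build the full file name from a stem such as "X" and the shared suffix.
  read_part <- function(stem) {
    fname <- paste0(stem, "_", fname_suffix, ".txt")
    read.table(file.path(path_prefix, fname))
  }

  x <- read_part("X")              # feature measurements
  y <- read_part("y")              # activity codes
  subject <- read_part("subject")  # subject IDs

  names(y) <- "activity"
  names(subject) <- "subject.id"

  # One data frame: subject ID, activity code, then all feature columns.
  cbind(subject, y, x)
}
```

Under these assumptions, the training data would be loaded with read_component(file.path("UCI HAR Dataset", "train"), "train") and the test data with the "test" directory and suffix.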

About

A simple R script to merge, clean, and summarize the Human Activity Recognition Using Smartphones data set.

Written for the April 2014 Getting and Cleaning Data course offered by Johns Hopkins University through Coursera.

Usage

Get run_analysis.R on your local machine using whatever method suits you.
In R, set your working directory to the directory that contains run_analysis.R.
Download the data set.
Extract the "UCI HAR Dataset" directory into the same directory as run_analysis.R.

Your working directory should contain both run_analysis.R and the UCI HAR Dataset directory.

Execute the script from the R command line with source("run_analysis.R")
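The layout described above can be checked before sourcing the script. The helper below is a hypothetical convenience, not part of run_analysis.R; it only tests that the two expected entries exist.

```r
# Hypothetical helper (not part of run_analysis.R): TRUE when `dir` holds both
# the script and the extracted "UCI HAR Dataset" directory.
layout_ok <- function(dir = getwd()) {
  file.exists(file.path(dir, "run_analysis.R")) &&
    dir.exists(file.path(dir, "UCI HAR Dataset"))
}

if (layout_ok()) {
  source("run_analysis.R")
} else {
  message("Expected run_analysis.R and 'UCI HAR Dataset' in ", getwd())
}
```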

Outputs and Variables

mergedData - A data.table containing the merged and cleaned data set.
tidyData - A data.table with the average (mean) value of the mean and standard deviation of each measurement, for each subject and activity.
tidy.txt - A text file containing tidyData.

Details

The script performs the steps below to produce a tidy data set with the mean of each std() and mean() feature for each activity and subject, and writes that to the file tidy.txt.

After running, the merged data can be referenced through the mergedData variable, and the summary data through the tidyData variable, both of which are of type data.table.

Merging the data into mergedData

Combines the training and test feature data (X_train.txt and X_test.txt) from the UCI HAR Dataset directory into one data.table, mergedData.

The Inertial Signals data is not used.

Applies the names in features.txt to the columns of mergedData.
Adds two columns to mergedData:
activity - from the y_train.txt and y_test.txt files.
subject.id - from the subject_train.txt and subject_test.txt files.
Replaces the activity column values with the corresponding labels defined in activity_labels.txt.
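The merge steps above can be sketched on toy data as follows. This is an illustration in base R with plain data frames rather than data.table, and the small in-memory objects stand in for the files, which the script would read from the UCI HAR Dataset directory.

```r
# Toy stand-ins for the data set files (assumed shapes, tiny values).
x_train <- data.frame(V1 = c(0.1, 0.2), V2 = c(1.0, 1.1))   # X_train.txt
x_test  <- data.frame(V1 = 0.3, V2 = 1.2)                    # X_test.txt
y_train <- c(1, 2)                                           # y_train.txt
y_test  <- 1                                                 # y_test.txt
subject_train <- c(1, 1)                                     # subject_train.txt
subject_test  <- 2                                           # subject_test.txt
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X")       # features.txt
activity_labels <- data.frame(                               # activity_labels.txt
  id = 1:2,
  label = c("WALKING", "WALKING_UPSTAIRS"),
  stringsAsFactors = FALSE
)

# Stack training and test rows, then apply the feature names.
merged <- rbind(x_train, x_test)
names(merged) <- features

# Add the activity and subject.id columns in the same row order.
merged$activity <- c(y_train, y_test)
merged$subject.id <- c(subject_train, subject_test)

# Replace numeric activity codes with their descriptive labels.
merged$activity <- factor(merged$activity,
                          levels = activity_labels$id,
                          labels = activity_labels$label)
```

The row order matters here: training rows must come first in every rbind() and c() call so that features, activities, and subjects stay aligned.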

Creating a tidy summary data set

Melts mergedData using the activity and subject.id columns for id variables.
Casts the molten data, by activity and subject.id, using mean as the aggregate function. The result of this cast operation is stored in the data.table variable tidyData.
Writes the cast summary data to tidy.txt in the current working directory.
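The summary step can be approximated in base R as shown below. Note the swap: the script melts and casts a data.table, while this sketch uses aggregate(), which computes the same per-group means in one call. The toy data, the column names mean_x and std_x, and the temporary output path are assumptions for illustration.

```r
# Toy merged data: two id columns plus two measurement columns.
merged <- data.frame(
  activity = c("WALKING", "WALKING", "SITTING"),
  subject.id = c(1, 1, 2),
  mean_x = c(0.2, 0.4, 0.1),
  std_x = c(1.0, 3.0, 5.0),
  stringsAsFactors = FALSE
)

# Mean of every measurement column for each activity and subject.id pair.
# aggregate() stands in for the melt-then-cast pair used by the script.
tidy <- aggregate(. ~ activity + subject.id, data = merged, FUN = mean)

# The script writes tidy.txt to the working directory; a temporary directory
# is used here so the sketch leaves no files behind.
out_file <- file.path(tempdir(), "tidy.txt")
write.table(tidy, out_file, row.names = FALSE)
```

The result has one row per activity and subject combination and one column per measurement, which is the shape described for tidyData.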

