DanEscasa / ExData_Plotting1

Plotting Assignment 1 for Exploratory Data Analysis: electric consumption

title: README
author: Daniel Escasa
date: February 1, 2021
output: html_document (number_sections: true)

Introduction

This assignment uses data from the UC Irvine Machine Learning Repository, a popular repository for machine learning datasets. In particular, we will be using the “Individual household electric power consumption Data Set” which the professor has made available on the course web site:

  • Dataset: Electric power consumption [20Mb]

  • Description: Measurements of electric power consumption in one household with a one-minute sampling rate over a period of almost 4 years. Different electrical quantities and some sub-metering values are available.

The following descriptions of the 9 variables in the dataset are taken from the UCI web site:

  1. Date: Date in format dd/mm/yyyy
  2. Time: time in format hh:mm:ss
  3. Global_active_power: household global minute-averaged active power (in kilowatt)
  4. Global_reactive_power: household global minute-averaged reactive power (in kilowatt)
  5. Voltage: minute-averaged voltage (in volt)
  6. Global_intensity: household global minute-averaged current intensity (in ampere)
  7. Sub_metering_1: energy sub-metering No. 1 (in watt-hour of active energy). It corresponds to the kitchen, containing mainly a dishwasher, an oven and a microwave (hot plates are not electric but gas powered).
  8. Sub_metering_2: energy sub-metering No. 2 (in watt-hour of active energy). It corresponds to the laundry room, containing a washing-machine, a tumble-drier, a refrigerator and a light.
  9. Sub_metering_3: energy sub-metering No. 3 (in watt-hour of active energy). It corresponds to an electric water-heater and an air-conditioner.
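As a rough illustration of how a file with these nine columns might be read, here is a minimal sketch. It assumes the fields are separated by semicolons and that missing values are coded as "?"; neither detail is stated above, and the two sample rows are made up for the example. For the real file, replace the `text =` argument with the path to the downloaded data.

```r
# Made-up sample rows in the assumed layout (";" separator, "?" for missing).
sample_text <- "Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3
1/2/2007;00:00:00;0.326;0.128;243.150;1.400;0.000;0.000;0.000
1/2/2007;00:01:00;?;?;?;?;?;?;?"

power <- read.table(
  text = sample_text,          # for the real data, pass the file path instead
  header = TRUE,
  sep = ";",                   # assumed field separator
  na.strings = "?",            # assumed missing-value marker
  colClasses = c("character", "character", rep("numeric", 7)),
  stringsAsFactors = FALSE
)

# Date and Time stay as character; the seven measurements are numeric.
str(power)
```

Keeping Date and Time as character columns at this stage leaves the choice of time zone to a later, explicit conversion step.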

Loading the data

The most important consideration when manipulating the dataset is the time zone. The dataset is distributed by UCI, so I treat its timestamps as being in UCI's time zone (PST8PDT). A time encoded as “00:00:00” is converted, in my case, to “16:00:00” in my local time zone (PST, GMT+0800). Failing to set the time zone to PST8PDT produces graphs that differ from those in the repository from which this one was forked. The adjustment is not needed for plot1.R, since neither its horizontal nor its vertical axis is time-based.
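The time shift can be reproduced with a short sketch. It is not the code from the plot scripts; the variable names are hypothetical, and `Etc/GMT-8` is used here only as a stand-in for a GMT+0800 local zone (in the tz database, the sign in `Etc/GMT` names is inverted, so `Etc/GMT-8` means eight hours ahead of GMT).

```r
# Parse a midnight timestamp, declaring PST8PDT as the time zone of the data.
stamp <- as.POSIXct("01/02/2007 00:00:00",
                    format = "%d/%m/%Y %H:%M:%S",
                    tz = "PST8PDT")

# Shown in the zone it was parsed in, the time is unchanged.
source_view <- format(stamp, "%H:%M:%S")

# Shown in a GMT+0800 zone, the same instant is sixteen hours later.
local_view <- format(stamp, "%H:%M:%S", tz = "Etc/GMT-8")

print(source_view)  # "00:00:00"
print(local_view)   # "16:00:00"
```

Passing `tz` explicitly when parsing keeps the time axis independent of the machine on which the script runs.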

In that sense, this is an extension of the Data Cleaning Course.

Making Plots

Our overall goal here is simply to examine how household energy usage varies over a 2-day period in February 2007. Our task is to reconstruct the plots below, all of which were constructed using the base plotting system.
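Restricting the data to a 2-day window might look like the sketch below. The specific dates (1 and 2 February 2007) are an assumption, since the text above only says "a 2-day period in February", and the small data frame is made up for the example.

```r
# Made-up rows; Date uses the dd/mm/yyyy layout described for the dataset.
power <- data.frame(
  Date = c("31/1/2007", "1/2/2007", "2/2/2007", "3/2/2007"),
  Global_active_power = c(1.2, 0.3, 2.5, 0.8),
  stringsAsFactors = FALSE
)

# Convert to Date objects so the comparison is chronological, not textual.
dates <- as.Date(power$Date, format = "%d/%m/%Y")

# Assumed window: 1 and 2 February 2007.
in_window <- dates >= as.Date("2007-02-01") & dates <= as.Date("2007-02-02")
two_days <- power[in_window, ]

print(two_days)
```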

For each plot we should

  • Construct the plot and save it to a PNG file with a width of 480 pixels and a height of 480 pixels.

  • Name each of the plot files as plot1.png, plot2.png, etc.

  • Create a separate R code file (plot1.R, plot2.R, etc.) that constructs the corresponding plot, i.e. code in plot1.R constructs the plot1.png plot. The code file should include code for reading the data so that the plot can be fully reproduced. We should also include the code that creates the PNG file.

  • Add the PNG file and R code file to our git repository
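The PNG requirement in the list above can be sketched with the base graphics device. The data plotted here is a placeholder, and the file is written to a temporary directory so the example does not overwrite anything; in a real script the file name would be plot1.png, plot2.png, and so on.

```r
# Hypothetical output path; a plot script would use "plot1.png" instead.
out_file <- file.path(tempdir(), "example.png")

# Open a 480 x 480 pixel PNG device (pixels are the default unit).
png(filename = out_file, width = 480, height = 480)

# Placeholder plot drawn with the base plotting system.
plot(1:10, (1:10)^2, type = "l", xlab = "x", ylab = "x squared")

# Close the device so the file is actually written to disk.
dev.off()
```

Opening the device before plotting, rather than copying from the screen device afterwards, means the output does not depend on the size of the on-screen window.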

The four required plots are shown below.

Plot 1

plot of chunk unnamed-chunk-2

Plot 2

plot of chunk unnamed-chunk-3

Plot 3

plot of chunk unnamed-chunk-4

Plot 4

plot of chunk unnamed-chunk-5

Running the code

Each of the four R scripts contains a line that sets the working directory. Edit that line to reflect your own working directory.

Each script defines a function ploti, where i is 1, 2, 3, or 4, to make the script easier to run. After sourcing the script, type ploti() in the R Console, substituting the desired value of i, to produce the plot. Don't forget the parentheses! Without them, RStudio will just print the function's source code instead of running it.
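The pattern of wrapping a script in a function can be sketched as follows. This is not the code from the repository: `plot_demo`, its `out_file` argument, and the numbers it plots are all hypothetical.

```r
# Hypothetical wrapper in the style of plot1(), plot2(), etc.
plot_demo <- function(out_file = file.path(tempdir(), "plot_demo.png")) {
  png(filename = out_file, width = 480, height = 480)
  on.exit(dev.off())                       # close the device even on error
  hist(c(0.2, 0.4, 0.4, 1.1, 1.5, 2.3),    # placeholder data
       main = "Demo histogram", xlab = "Value")
  invisible(out_file)                      # return the path without printing it
}

# With parentheses the function runs and the PNG is written.
written <- plot_demo()

# Without parentheses R only prints the function definition:
# plot_demo
```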
