Checkpoint 2
You can download a template RMarkdown file to start from here.
This work should be done individually, meaning that each person should choose their own meter. You should support each other in your work, but make sure that what you type in this document reflects your own words and ideas.
Macalester Energy Use
Go to the Moodle Checkpoint 2 assignment and download the dataset, Macalester_Electricity_Use.csv.
This dataset, provided by the Sustainability Office, includes the electricity meter data for properties that Macalester College owns. Note that there are three types of electricity meters: Electric - Grid, Electric - Solar, Electric - Wind. The data were originally recorded as monthly usage, each record with a start and end date. Brianna has done some initial cleaning to calculate the average usage per day to account for different month lengths. Below are the definitions of some of the variables in the dataset.
start_date: The first date of the collection period
property_name: The name of the property (26 unique property names)
property_type: Type of property (9 unique property types)
meter_name: The name of the meter (36 unique meter names)
meter_type: The type of the meter (3 unique meter types)
use_per_day: The average electricity usage per day within the collection period (calculated by taking total usage / days in collection period).
gross_floor_are: The square footage of area in the property
street_address: Address of the property
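To make the use_per_day definition concrete, here is a small sketch of the calculation. The end_date and total_usage column names and all of the numbers are made up for illustration, and counting both the start and end day in the period is an assumption; the cleaning actually applied to the dataset may differ.

```r
# Hypothetical raw monthly readings (end_date and total_usage are assumed names).
raw <- data.frame(
  start_date  = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01")),
  end_date    = as.Date(c("2020-01-31", "2020-02-29", "2020-03-31")),
  total_usage = c(3100, 2900, 3100)
)

# Days in each collection period (assumption: both endpoints are counted).
raw$days <- as.numeric(raw$end_date - raw$start_date) + 1

# Average usage per day = total usage / days in collection period.
raw$use_per_day <- raw$total_usage / raw$days
print(raw)
```

Dividing by the period length is what makes a 29-day February comparable to a 31-day March.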
- Download the data. Put the file in a known location on your computer (the same folder that this Rmd is in). Look at the CSV file and then read the data into R.
- Choose a meter from the 36 options whose usage shows a non-constant trend and seasonality. Tell me its Property Name and Meter Name. Make sure you choose a meter with at least 90 observations (some meters are relatively new, which makes it harder to estimate a trend and seasonality). Visualize and explore a few before making a decision.
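One possible way to get started with reading the data and screening meters, sketched in base R. So that it runs on its own, the sketch writes a tiny made-up stand-in file first; the meter names, the values, and the YYYY-MM-DD date format are all assumptions, and with the real data you would point read.csv() at Macalester_Electricity_Use.csv instead.

```r
# Build a small fake file so this sketch is self-contained (not the real data).
set.seed(1)
toy <- data.frame(
  start_date  = rep(as.character(seq(as.Date("2014-01-01"), by = "month",
                                     length.out = 100)), 2),
  meter_name  = rep(c("Meter A", "Meter B"), each = 100),
  use_per_day = round(runif(200, 50, 150), 1)
)
toy <- toy[-(151:200), ]  # make "Meter B" a short series (50 rows)
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

# Read the file and convert the date column (assumes YYYY-MM-DD strings).
energy <- read.csv(path, stringsAsFactors = FALSE)
energy$start_date <- as.Date(energy$start_date)

# Count observations per meter and keep those with at least 90.
n_obs <- table(energy$meter_name)
eligible <- names(n_obs)[n_obs >= 90]
print(n_obs)

# Quick look at one candidate meter over time.
one <- energy[energy$meter_name == eligible[1], ]
one <- one[order(one$start_date), ]
plot(one$start_date, one$use_per_day, type = "l",
     xlab = "start_date", ylab = "use_per_day", main = eligible[1])
```

Repeating the last few lines for several eligible meters is a quick way to compare candidates before settling on one.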
ANSWER:
- Come up with at least two estimates of the trend of the usage. Plot both of those estimates as a function of time. Justify which estimate you prefer.
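As a rough illustration of what two trend estimates might look like, the sketch below fits a straight line and a loess smoother to simulated monthly data. The simulated series, the time index t, and the loess span are assumptions chosen for the example; other estimators (moving averages, splines, polynomial fits) are equally reasonable choices.

```r
# Simulated monthly series: linear trend + yearly cycle + noise (not real data).
set.seed(2)
n <- 120
meter <- data.frame(
  start_date = seq(as.Date("2012-01-01"), by = "month", length.out = n),
  t = seq_len(n)
)
meter$use_per_day <- 500 + 2 * meter$t +
  80 * sin(2 * pi * meter$t / 12) + rnorm(n, sd = 20)

# Estimate 1: straight-line trend in time.
fit_lm <- lm(use_per_day ~ t, data = meter)
meter$trend_lm <- fitted(fit_lm)

# Estimate 2: local regression; a wide span smooths over the yearly cycle.
fit_lo <- loess(use_per_day ~ t, data = meter, span = 0.75)
meter$trend_lo <- fitted(fit_lo)

# Plot the data with both trend estimates as a function of time.
plot(meter$start_date, meter$use_per_day, type = "l", col = "grey40",
     xlab = "start_date", ylab = "use_per_day")
lines(meter$start_date, meter$trend_lm, col = "blue", lwd = 2)
lines(meter$start_date, meter$trend_lo, col = "darkgreen", lwd = 2)
legend("topleft", legend = c("data", "linear", "loess"),
       col = c("grey40", "blue", "darkgreen"), lty = 1)
```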
ANSWER:
- Plot the de-trended series (residuals = original outcome data - estimated trend). Comment on the de-trended series.
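The de-trending step can be sketched as a plain subtraction, following the definition in the question. The data are simulated, and the loess trend is just one assumed choice of estimate; substitute whichever trend you preferred.

```r
# Simulated monthly series (not real data).
set.seed(3)
n <- 120
meter <- data.frame(
  start_date = seq(as.Date("2012-01-01"), by = "month", length.out = n),
  t = seq_len(n)
)
meter$use_per_day <- 500 + 2 * meter$t +
  80 * sin(2 * pi * meter$t / 12) + rnorm(n, sd = 20)

# Estimated trend (assumed choice: loess with a wide span).
meter$trend <- fitted(loess(use_per_day ~ t, data = meter, span = 0.75))

# Residuals = original outcome data - estimated trend.
meter$detrended <- meter$use_per_day - meter$trend

# A well-removed trend leaves a series that hovers around zero.
plot(meter$start_date, meter$detrended, type = "l",
     xlab = "start_date", ylab = "de-trended use_per_day")
abline(h = 0, lty = 2)
```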
ANSWER:
- Estimate the seasonality of the de-trended series. Plot the average cycle. Comment on the average cycle.
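One simple way to estimate a yearly cycle is to average the de-trended values within each calendar month. The sketch below does this on simulated data; the 12-month period, the month column, and the loess trend are assumptions made for the example.

```r
# Simulated monthly series (not real data).
set.seed(4)
n <- 120
meter <- data.frame(
  start_date = seq(as.Date("2012-01-01"), by = "month", length.out = n),
  t = seq_len(n)
)
meter$use_per_day <- 500 + 2 * meter$t +
  80 * sin(2 * pi * meter$t / 12) + rnorm(n, sd = 20)

# De-trend with an assumed loess trend.
meter$trend <- fitted(loess(use_per_day ~ t, data = meter, span = 0.75))
meter$detrended <- meter$use_per_day - meter$trend

# Calendar month of each observation (1 = January, ..., 12 = December).
meter$month <- as.integer(format(meter$start_date, "%m"))

# Average cycle: mean de-trended usage for each month of the year.
avg_cycle <- tapply(meter$detrended, meter$month, mean)

plot(1:12, avg_cycle, type = "b",
     xlab = "month of year", ylab = "average de-trended use_per_day")
abline(h = 0, lty = 2)
```

An equivalent estimate comes from regressing the de-trended series on a month indicator, for example lm(detrended ~ factor(month)).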
ANSWER:
- Plot the de-trended series after removing the seasonality (so you are left with the errors). Comment on what you observe.
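Removing the seasonality can be sketched as one more subtraction: take the de-trended series and subtract each observation's monthly average. The data are simulated, and the loess trend and month-mean seasonality are assumed choices carried over for illustration.

```r
# Simulated monthly series (not real data).
set.seed(5)
n <- 120
meter <- data.frame(
  start_date = seq(as.Date("2012-01-01"), by = "month", length.out = n),
  t = seq_len(n)
)
meter$use_per_day <- 500 + 2 * meter$t +
  80 * sin(2 * pi * meter$t / 12) + rnorm(n, sd = 20)

# Step 1: remove an assumed loess trend.
meter$trend <- fitted(loess(use_per_day ~ t, data = meter, span = 0.75))
meter$detrended <- meter$use_per_day - meter$trend

# Step 2: seasonal estimate = mean of the de-trended values in each month.
meter$month <- as.integer(format(meter$start_date, "%m"))
meter$seasonal <- ave(meter$detrended, meter$month)  # ave() defaults to mean

# Step 3: what is left over are the errors.
meter$errors <- meter$detrended - meter$seasonal

plot(meter$start_date, meter$errors, type = "l",
     xlab = "start_date", ylab = "errors")
abline(h = 0, lty = 2)
```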
ANSWER:
- Estimate the autocorrelation of the errors (using acf(), assuming stationarity) and comment on what you observe about the errors.
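A minimal sketch of the acf() step on simulated data. Here the noise is deliberately generated as an AR(1) process so that there is some correlation to see; that choice, the loess trend, and the month-mean seasonality are all assumptions for the example and say nothing about the real meters.

```r
# Simulated series with correlated (AR(1)) noise (not real data).
set.seed(6)
n <- 120
meter <- data.frame(
  start_date = seq(as.Date("2012-01-01"), by = "month", length.out = n),
  t = seq_len(n)
)
noise <- as.numeric(arima.sim(model = list(ar = 0.5), n = n, sd = 20))
meter$use_per_day <- 500 + 2 * meter$t +
  80 * sin(2 * pi * meter$t / 12) + noise

# Remove an assumed loess trend and month-mean seasonality.
meter$trend <- fitted(loess(use_per_day ~ t, data = meter, span = 0.75))
meter$detrended <- meter$use_per_day - meter$trend
meter$month <- as.integer(format(meter$start_date, "%m"))
meter$errors <- meter$detrended - ave(meter$detrended, meter$month)

# Sample autocorrelation of the errors at lags 0 through 24.
acf_out <- acf(meter$errors, lag.max = 24, main = "ACF of errors")
```

Bars that extend beyond the dashed bounds in the plot suggest correlation at that lag that is larger than expected from independent noise.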
ANSWER:
- Predict the average daily meter usage based on the trend and seasonality for the next 5 months. We haven’t formally learned how to do this, but think about how you might do it; be creative. Your goal is to see if you can create a plot of the original data and add a red line that shows this prediction. Do your best.
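One of many possible approaches, sketched on simulated data: fit a single regression with a linear time trend plus a month-of-year effect, then predict at the next five time points. The linear trend is an assumption that makes extrapolation easy; it is not the only option and may not suit every meter.

```r
# Simulated monthly series (not real data).
set.seed(7)
n <- 120
meter <- data.frame(
  start_date = seq(as.Date("2012-01-01"), by = "month", length.out = n),
  t = seq_len(n)
)
meter$use_per_day <- 500 + 2 * meter$t +
  80 * sin(2 * pi * meter$t / 12) + rnorm(n, sd = 20)
meter$month <- factor(format(meter$start_date, "%m"))

# Trend (linear in t) plus seasonality (one level per calendar month).
fit <- lm(use_per_day ~ t + month, data = meter)

# The next 5 months after the last observed start_date.
future_dates <- seq(max(meter$start_date), by = "month", length.out = 6)[-1]
future <- data.frame(
  start_date = future_dates,
  t = n + 1:5,
  month = factor(format(future_dates, "%m"), levels = levels(meter$month))
)
future$pred <- predict(fit, newdata = future)

# Original data with the prediction added as a red line.
plot(meter$start_date, meter$use_per_day, type = "l",
     xlim = range(c(meter$start_date, future$start_date)),
     ylim = range(c(meter$use_per_day, future$pred)),
     xlab = "start_date", ylab = "use_per_day")
lines(future$start_date, future$pred, col = "red", lwd = 2)
```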
ANSWER:
- Briefly reflect on what you’ve learned about Macalester’s energy usage and the questions you have about the data generation process.
ANSWER: