Homework 4

You can download a template file to start from here.

GitHub Setup

GitHub is an online storage service (like Google Drive) that tracks the version history of code files more elegantly than Google Drive does, which makes collaboration more efficient. Keep in mind that you work on a “local version” on your computer. For your collaborator to see your updates, you must make a commit (a snapshot of your files) and push it to GitHub (in the cloud). To see any updates your collaborator makes, you must pull the changes from GitHub (from the cloud) to your local machine.

To commit, push, and pull, I recommend using GitHub Desktop.

To create a shared repository (for you, your partner, and me), each person should go to https://classroom.github.com/a/fHdgI0Cv. One of you should create a new team and name it TS-Name1-Name2, replacing Name1 and Name2 with your and your partner’s names. Once the team is created, the other person can join it.

Once the repository is created, clone it to your local machine. I highly recommend using GitHub Desktop to help you set it up.

On your local machine, you should see an empty folder.

One team member:

  • Download the HW4 template file to start from here and put it in this folder.
  • Move a copy of the Macalester Electricity data (that you used for HW3) to this folder.
  • Commit and push your changes.

One other team member:

  • Pull the changes after the first team member has set up the folder.
  • If you have three people on your team, copy everything labeled “Person B” and make a version labeled “Person C”.
  • Do a Find and Replace for “Person A” and replace it with the first name of one of your teammates.
  • Do a Find and Replace for “Person B” and replace it with your own first name.
  • If you have three people on your team, do a Find and Replace for “Person C” and replace it with the first name of your other teammate.
  • Commit and push your changes.

Everyone on the team should then pull the changes.

You should work with your time series project partner on the R code, but I want each of you to write your own paragraphs about what you learn. Having separate insights will benefit you when you write up the mini-project.

Submission:

  • Render your Quarto document to HTML.
  • Commit and push your changes to GitHub.

Revisit HW 3

  1. With your partner, decide on one meter on Macalester’s campus you’d like to explore. You may choose one of the meters you used individually or choose another. Please list the Meter Name and Property Name that you decided on together.

ANSWER:

# load in the data & include any data cleaning needed (creating date variables, etc.)
library(tidyverse)
library(astsa)

  1. Individually, do a brief search about electricity usage as it relates to seasonality and sustainability goals on college campuses. Each of you should find 2 unique reputable sources (journal articles, news sources, or websites) on the topic. Each of you should then write a short paragraph introducing the general topic of electricity use and explaining why you think it is interesting and important to investigate usage over time. Think of this as a draft of your introductory paragraph.

ANSWER:

Source URL 1 [Person A]:

Source URL 2 [Person A]:

Paragraph [Person A]:

Source URL 1 [Person B]:

Source URL 2 [Person B]:

Paragraph [Person B]:

Visualize

  1. Together, create a variety of visualizations of the time series. Each person should choose one that they find interesting and informative, then finalize it and make it look professional. For your visualization, write a paragraph summarizing what you learn about the data from it. Each person should have one visual and one paragraph.

# Plot 1

Paragraph of Plot 1 [Person A]:

# Plot 2

Paragraph of Plot 2 [Person B]:
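
If you are unsure where to start with the visualizations, here is a minimal sketch of two common views of a monthly series: the full series over time, and one line per year plotted against month. The data are simulated, and the names dat, kwh, p1, and p2 are made up for illustration; adapt them to your meter data.

```r
library(ggplot2)

set.seed(1)
# Simulated monthly usage (a hypothetical stand-in for one meter)
dat <- data.frame(date = seq(as.Date("2015-01-01"), by = "month", length.out = 96))
dat$month <- as.integer(format(dat$date, "%m"))
dat$year <- as.integer(format(dat$date, "%Y"))
dat$kwh <- 1000 + 2 * seq_len(96) + 150 * cos(2 * pi * dat$month / 12) + rnorm(96, sd = 40)

# View 1: the full series over time (shows the long-run trend)
p1 <- ggplot(dat, aes(x = date, y = kwh)) +
  geom_line() +
  labs(x = "Date", y = "Electricity use (kWh, simulated)", title = "Monthly usage over time") +
  theme_classic()

# View 2: one line per year against month (compares the seasonal shape across years)
p2 <- ggplot(dat, aes(x = month, y = kwh, group = year, color = factor(year))) +
  geom_line() +
  scale_x_continuous(breaks = 1:12) +
  labs(x = "Month", y = "Electricity use (kWh, simulated)", color = "Year",
       title = "Seasonal pattern by year") +
  theme_classic()

p1
p2
```

The first view is usually better for talking about trend, the second for talking about seasonality.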

Detrend & Decompose

  1. Together, try at least 2 methods of estimating the trend of the time series data. For the best 2 methods, make a plot of the estimated trend and a plot of the leftover residuals. Each person should write a paragraph justifying one of the methods you used to estimate the trend and describing what you learn about the data from the visualizations.

ANSWER:

# Method 1

Paragraph [Person A]:

# Method 2

Paragraph [Person B]:
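
As one possible starting point for the trend question, the sketch below estimates a trend two ways on simulated data: a centered 12-month moving average and a B-spline regression on time. The simulated series, the choice of df = 5, and all variable names are assumptions for illustration, not recommendations for your meter.

```r
library(splines)  # ships with R; provides bs()

set.seed(2)
n <- 120
time <- 1:n
month <- ((time - 1) %% 12) + 1
y <- 100 + 0.5 * time + 20 * sin(2 * pi * month / 12) + rnorm(n, sd = 5)

# Method 1: centered 12-month moving average (half weights on the two end months).
# stats::filter is spelled out because dplyr::filter masks it once tidyverse is loaded.
ma_weights <- c(0.5, rep(1, 11), 0.5) / 12
trend_ma <- as.numeric(stats::filter(y, filter = ma_weights, sides = 2))

# Method 2: B-spline regression on time
trend_bs <- fitted(lm(y ~ bs(time, df = 5)))

# Leftover residuals after removing each trend estimate
resid_ma <- y - trend_ma
resid_bs <- y - trend_bs

# Estimated trends on top of the data, then one residual plot
plot(time, y, type = "l", col = "grey50", xlab = "Time", ylab = "y")
lines(time, trend_ma, col = "blue")
lines(time, trend_bs, col = "red")
plot(time, resid_bs, type = "l", xlab = "Time", ylab = "Residuals (B-spline trend)")
```

Note that the moving average is undefined for the first and last 6 months, so its residuals contain missing values at both ends.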

  1. Together, make a plot that shows the seasonality. Then for each of the de-trending methods above, estimate the seasonality and make a plot of the seasonality and a plot of the left over residuals. Justify the method you used to estimate the seasonality and write a brief paragraph about what you learn about the data from the visualizations.

ANSWER:

# Method 1

Paragraph [Person A]:

# Method 2

Paragraph [Person B]:
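
One simple way to estimate seasonality, sketched here on simulated data, is to fit one mean per calendar month to the detrended series. The quadratic trend, the simulated values, and the variable names are assumptions for illustration; use your own de-trending methods from the previous question.

```r
set.seed(3)
n <- 120
time <- 1:n
month <- ((time - 1) %% 12) + 1
y <- 100 + 0.5 * time + 20 * sin(2 * pi * month / 12) + rnorm(n, sd = 5)

# Remove a trend first (a quadratic in time is used here only as an example)
trend_mod <- lm(y ~ poly(time, 2))
detrended <- resid(trend_mod)

# Seasonal estimate: one mean per calendar month of the detrended series
season_mod <- lm(detrended ~ factor(month))
seasonal <- fitted(season_mod)
errors <- resid(season_mod)

# A plot that shows the seasonality, the estimated seasonal pattern, and the leftover errors
boxplot(detrended ~ month, xlab = "Month", ylab = "Detrended y")
plot(time, seasonal, type = "l", xlab = "Time", ylab = "Estimated seasonality")
plot(time, errors, type = "l", xlab = "Time", ylab = "Residuals")
```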

  1. Now, try going back to the original data and using differencing to remove the trend and seasonality. Make a plot of the left over residuals. Write a brief paragraph about what you learn about the data from the visualization.

ANSWER:

Paragraph [Person A]:

Paragraph [Person B]:
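
For reference, differencing can be done with diff(). The sketch below, on simulated monthly data with made-up variable names, takes a first difference to remove a linear trend and then a lag-12 difference to remove a yearly seasonal pattern.

```r
set.seed(4)
n <- 120
time <- 1:n
month <- ((time - 1) %% 12) + 1
y <- 100 + 0.5 * time + 20 * sin(2 * pi * month / 12) + rnorm(n, sd = 5)

# First difference removes a linear trend
d1 <- diff(y, lag = 1)

# Lag-12 difference of the result removes a period-12 seasonal pattern
d1_12 <- diff(d1, lag = 12)

# Differencing costs observations: 1 for the first difference, 12 for the seasonal one
length(d1_12)

plot(d1_12, type = "l", xlab = "Index", ylab = "Differenced y")
```

The order of the two differences does not matter; the result is the same either way.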

  1. Lastly, plot the sample autocorrelation function and the sample partial autocorrelation function (acf2()) of the errors after removing both the trend and seasonality [choose the errors from differencing or from estimating & removing]. Describe the patterns you see and make comments about any insights you might have about how to go about modeling the errors. The partial autocorrelation function gives the conditional correlation of points lag k apart, conditional on the data in between. [If we haven’t talked about what to do with info we gain from the pacf yet, you can still comment on what you observe].

ANSWER:

Paragraph [Person A]:

Paragraph [Person B]:
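
To make the ACF and PACF concrete, here is a small sketch on simulated AR(2) errors. It uses base R's acf() and pacf() so the numbers can be inspected directly, and calls acf2() from astsa only if that package is installed. The simulated errors and the approximate 1.96/sqrt(n) bound are assumptions for illustration.

```r
set.seed(5)
# Simulated AR(2) errors (a stand-in for your leftover errors)
errors <- arima.sim(list(ar = c(0.2, 0.4)), n = 500)

# Sample ACF (includes lag 0) and sample PACF (starts at lag 1)
a <- acf(errors, lag.max = 24, plot = FALSE)
p <- pacf(errors, lag.max = 24, plot = FALSE)

# Approximate 95% bound for a white noise series
bound <- 1.96 / sqrt(length(errors))

# Lags where the PACF falls outside the bound
which(abs(p$acf) > bound)

plot(a, main = "Sample ACF")
plot(p, main = "Sample PACF")

# acf2() draws both plots in one call
if (requireNamespace("astsa", quietly = TRUE)) {
  astsa::acf2(errors, max.lag = 24)
}
```

For an AR(p) process, the PACF is expected to drop to roughly zero after lag p while the ACF decays gradually, which is the kind of pattern to look for in your own plots.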

Modeling Errors

  1. Come up with a list of candidate models for the errors based on the ACF and PACF. Justify those choices.

ANSWER:

Paragraph [Person A]:

Paragraph [Person B]:

  1. Fit the candidate models for the errors and compare them. Write a paragraph justifying the choice of one model over the other models.

ANSWER:

Paragraph [Person A]:

Paragraph [Person B]:
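
One way to compare candidate models is to fit each one and tabulate an information criterion. The sketch below does this with base R's arima() rather than sarima(), on simulated AR(2) errors; the candidate list and variable names are only examples. Note that sarima() reports its own AIC and BIC values, which may be scaled differently, so compare models within one function's output.

```r
set.seed(6)
# Simulated AR(2) errors (a stand-in for your leftover errors)
errors <- arima.sim(list(ar = c(0.2, 0.4)), n = 500)

# Candidate (p, d, q) orders; replace with the ones your ACF/PACF suggest
candidates <- list(AR1 = c(1, 0, 0), AR2 = c(2, 0, 0), MA2 = c(0, 0, 2), ARMA11 = c(1, 0, 1))

# Fit every candidate to the same series
fits <- lapply(candidates, function(ord) arima(errors, order = ord))

# Lower AIC/BIC is better; BIC is computed as AIC with penalty log(n)
comparison <- data.frame(
  model = names(fits),
  AIC = sapply(fits, AIC),
  BIC = sapply(fits, function(f) AIC(f, k = log(length(errors))))
)
comparison[order(comparison$BIC), ]
```

Information criteria are only one input; residual diagnostics and the significance of the estimated coefficients are also worth discussing in your justification.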

  1. Now fit your chosen model, incorporating the trend estimation or differencing in the model fit. If you used B-splines or a polynomial linear model, incorporate your estimates of the trend and seasonality into the model fit by adapting the example code below (consolidate your lm models into one). If you are using the differenced data, incorporate the differencing through the d (trend) and D (seasonality) arguments in sarima(). Rerun the final models.

ANSWER:

# Generating data so the example code runs (ignore for your work)
library(astsa)  # provides sarima()
set.seed(1)     # makes the simulated data reproducible
time <- 1:500
month <- (time %% 12) + 1
y <- 0.5 * time + 300 * (time == 123) + 50 * (month > 6) + arima.sim(list(ar = c(0.2, 0.4)), 500)

# Example code below (adapt to your trend + seasonality model)
trend.mod <- lm(y ~ time + factor(month) + (time == 123)) # time == 123 deals with an outlier/anomaly
X <- model.matrix(trend.mod)[, -1] # -1 removes the intercept column

# Adapt to your ARMA(p,q)
sarima(y, p = 2, d = 0, q = 0, xreg = X)
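
If you went the differencing route instead, the differencing goes into the model call rather than being applied by hand. The sketch below shows this on simulated monthly data, first with base R's arima() and then with the corresponding sarima() call (run only if astsa is installed). The orders p = 2 and q = 0 are placeholders, not a recommendation, and the variable names are made up.

```r
set.seed(7)
n <- 240
time <- 1:n
month <- ((time - 1) %% 12) + 1
y <- ts(0.5 * time + 50 * (month > 6) + as.numeric(arima.sim(list(ar = c(0.2, 0.4)), n)),
        frequency = 12)

# Base R: one regular difference (d = 1) and one seasonal difference (D = 1) with period 12
fit <- arima(y, order = c(2, 1, 0), seasonal = list(order = c(0, 1, 0), period = 12))
fit

# The same specification through sarima(): d handles the trend, D and S the seasonality
if (requireNamespace("astsa", quietly = TRUE)) {
  astsa::sarima(y, p = 2, d = 1, q = 0, P = 0, D = 1, Q = 0, S = 12)
}
```

Passing the original (undifferenced) series together with d and D keeps the fitted model on the original scale, which matters later for forecasting.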

Predicting the Future

Try this out if you have time; otherwise, you can incorporate it into the mini-project.

  1. Create a prediction for the next 24 months in the future using sarima.for(). Make a plot of those predictions and tell a brief story about what they can tell you.

ANSWER:

Paragraph [Person A]:

Paragraph [Person B]:
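
Here is a rough sketch of a 24-month forecast on simulated data. It uses base R's arima() and predict() so the forecasts and standard errors can be inspected, then calls sarima.for() with the same specification if astsa is installed. The model orders, the start date, and the variable names are assumptions for illustration.

```r
set.seed(8)
n <- 120
time <- 1:n
month <- ((time - 1) %% 12) + 1
y <- ts(0.5 * time + 50 * (month > 6) + as.numeric(arima.sim(list(ar = c(0.2, 0.4)), n)),
        start = c(2015, 1), frequency = 12)

# Fit a model with differencing built in (orders are placeholders)
fit <- arima(y, order = c(2, 1, 0), seasonal = list(order = c(0, 1, 0), period = 12))

# Forecast 24 months ahead with approximate 95% prediction bounds
fc <- predict(fit, n.ahead = 24)
upper <- fc$pred + 1.96 * fc$se
lower <- fc$pred - 1.96 * fc$se

# Observed series followed by forecasts and bounds
ts.plot(y, fc$pred, upper, lower,
        gpars = list(col = c("black", "blue", "grey50", "grey50"), lty = c(1, 1, 2, 2)))

# The same forecast through astsa, which also draws the plot
if (requireNamespace("astsa", quietly = TRUE)) {
  astsa::sarima.for(y, n.ahead = 24, p = 2, d = 1, q = 0, P = 0, D = 1, Q = 0, S = 12)
}
```

If your model uses xreg for the trend and seasonality, sarima.for() also needs the future values of those regressors through its newxreg argument.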