Checkpoint 2
You can download a template RMarkdown file to start from here.
This work should be done individually, meaning that each person should have their own keyword and data set. You should support each other in your work, but make sure that what you type in this document reflects your own words and ideas.
Google Trends
Go to https://trends.google.com/trends/?geo=US and try out a variety of keywords. Make sure to change the time range of the plot to 2004-present. The time series you see is the search interest relative to the highest point on the chart for the given region and time. A value of 100 is the peak popularity for the term. A value of 50 means that the term is half as popular as it was at its peak. A score of 0 means there was not enough data to estimate the search interest for this term.
- Choose a keyword associated with a search that has a series with a non-constant trend and some seasonality. Tell me what that keyword is.
ANSWER:
- Download the Google Trends data (look for the down arrow). Put the file in a known location on your computer. Look at the CSV file and then read the data into R.
library(dplyr)
library(ggplot2)
library(readr)
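As a rough sketch of the reading step: a Google Trends export typically starts with a short preamble before the header row, so it usually helps to skip those lines. The file written below is a hypothetical stand-in (the keyword "sunscreen" and all values are made up), and the number of lines to skip is an assumption you should check against your own file. Base R's read.csv() is used here so the example runs without extra packages.

```r
# Hypothetical stand-in for a downloaded export; your real file may differ.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "Category: All categories",
  "",
  "Month,sunscreen: (United States)",
  "2004-01,20",
  "2004-02,22",
  "2004-03,31"
), path)

# Skip the preamble and the original header (3 lines here, an assumption),
# then supply short column names of our own.
trends <- read.csv(path, skip = 3, header = FALSE,
                   col.names = c("month", "interest"),
                   stringsAsFactors = FALSE)

# Turn "YYYY-MM" into a Date by pinning each month to its first day.
trends$date <- as.Date(paste0(trends$month, "-01"))

head(trends)
```

Open your own CSV in a text editor first to confirm how many lines come before the data rows.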
- Come up with at least two estimates of the trend. Plot both of those estimates as a function of time. Justify which estimate you prefer.
ANSWER:
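To illustrate what "two estimates of the trend" could look like, here is a minimal sketch on simulated data, not on real Google Trends output. The series y, the quadratic fit, the centered 12-month moving average, and the loess span are all illustrative choices, not the required methods.

```r
# Simulated monthly series (made-up values): trend + yearly cycle + noise.
set.seed(1)
n <- 120
t <- seq_len(n)
y <- 30 + 0.3 * t + 10 * sin(2 * pi * t / 12) + rnorm(n, sd = 3)

# Estimate 1: a global quadratic fit in time.
trend_poly <- fitted(lm(y ~ poly(t, 2)))

# Estimate 2: a centered 12-month moving average (half weights on the ends).
# The first and last 6 values are NA because the window does not fit there.
w <- c(0.5, rep(1, 11), 0.5) / 12
trend_ma <- as.numeric(stats::filter(y, filter = w, sides = 2))

# Estimate 3: a local regression smoother.
trend_loess <- fitted(loess(y ~ t, span = 0.5))

plot(t, y, type = "l", col = "grey50", xlab = "Month index", ylab = "Interest")
lines(t, trend_poly, col = "blue")
lines(t, trend_ma, col = "orange")
lines(t, trend_loess, col = "darkgreen")
```

When you justify your preference, consider how each estimate behaves at the ends of the series and whether it picks up any of the seasonal pattern.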
- Plot the de-trended series (residuals = original outcome data - estimated trend). Comment on the de-trended series.
ANSWER:
- Estimate the seasonality of the de-trended series. Plot the average cycle. Comment on the average cycle.
ANSWER:
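One possible way to estimate the average cycle, sketched on simulated data: subtract a trend estimate, then average the de-trended values within each calendar month. The series y, the loess trend, and the month variable are assumptions made for illustration only.

```r
# Simulated monthly series (made-up values): trend + yearly cycle + noise.
set.seed(1)
n <- 120
t <- seq_len(n)
month <- factor((t - 1) %% 12 + 1)   # calendar month 1..12 for each time point
y <- 30 + 0.3 * t + 10 * sin(2 * pi * t / 12) + rnorm(n, sd = 3)

# De-trend: original series minus an estimated trend (loess here).
detrended <- y - fitted(loess(y ~ t, span = 0.5))

# Average cycle: mean of the de-trended values for each month.
avg_cycle <- tapply(detrended, month, mean)

plot(1:12, avg_cycle, type = "b",
     xlab = "Month", ylab = "Average de-trended interest")
abline(h = 0, lty = 2)
```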
- Plot the de-trended series after removing the seasonality (so you are left with the errors). Comment on what you observe.
ANSWER:
- Estimate the autocorrelation of the errors (using acf(), assuming stationarity) and comment on what you observe about the errors.
ANSWER:
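A minimal sketch of this step on simulated data: remove the trend and the monthly averages, then pass what is left to acf(). The simulated series and the loess trend are illustrative assumptions; with your data, the errors come from your own trend and seasonality estimates.

```r
# Simulated monthly series (made-up values): trend + yearly cycle + noise.
set.seed(1)
n <- 120
t <- seq_len(n)
month <- factor((t - 1) %% 12 + 1)
y <- 30 + 0.3 * t + 10 * sin(2 * pi * t / 12) + rnorm(n, sd = 3)

# Remove the trend, then remove the monthly average of what remains.
detrended <- y - fitted(loess(y ~ t, span = 0.5))
errors <- detrended - ave(detrended, month)   # ave() returns each month's mean

# Sample autocorrelation up to 24 lags; lag 0 is always 1.
acf_out <- acf(errors, lag.max = 24, main = "ACF of the errors")
```

The dashed lines in the plot are approximate 95% bounds for an uncorrelated series, so bars that extend well beyond them suggest leftover dependence.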
- Predict the relative interest in your keyword based on the trend and seasonality for the next 5 months. We haven’t formally learned how to do this, but think about how you might do it; be creative. Your goal is to see if you can create a plot of the original data and add a red line that shows this prediction. Do your best.
ANSWER:
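Here is one possible approach among many, sketched on simulated data: fit a linear trend plus a separate level for each month, then extend both five months past the end of the series. The model form and the simulated values are assumptions for illustration, not a prescribed method.

```r
# Simulated monthly series (made-up values): trend + yearly cycle + noise.
set.seed(1)
n <- 120
t <- seq_len(n)
month <- factor((t - 1) %% 12 + 1)
y <- 30 + 0.3 * t + 10 * sin(2 * pi * t / 12) + rnorm(n, sd = 3)

# Linear trend in time plus one indicator per calendar month.
fit <- lm(y ~ t + month)

# Time index and calendar month for the next 5 months.
t_new <- (n + 1):(n + 5)
new_data <- data.frame(
  t = t_new,
  month = factor((t_new - 1) %% 12 + 1, levels = levels(month))
)
pred <- predict(fit, newdata = new_data)

# Original data with the prediction added as a red line.
plot(t, y, type = "l", xlim = c(1, n + 5),
     xlab = "Month index", ylab = "Interest")
lines(t_new, pred, col = "red", lwd = 2)
```

A straight-line trend can extrapolate poorly if the recent trend is curving, so check whether the red line looks plausible next to the last few years of data.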