library(lubridate)

# Differently formatted date-times; ymd_hms() parses every element
x <- c(20100101120101, "2009-01-02 12-01-02", "2009.01.03 12:01:03",
       "2009-1-4 12-1-4", "2009-1, 5 12:1, 5", "200901-08 1201-08",
       "2009 arbitrary 1 non-decimal 6 chars 12 in between 1 !!! 6",
       "OR collapsed formats: 20090107 120107 (as long as prefixed with zeros)",
       "Automatic wday, Thu, detection, 10-01-10 10:01:10 and p format: AM",
       "Created on 10-01-11 at 10:01:11 PM")
ymd_hms(x)
hour(), minute(), wday(), yday(), mday(), month(), year(), days_in_month(), etc.
Creating new versions of Date/Time
Numeric / Decimal Time = hour + minute / 60
Numeric / Decimal Month and Day = month + day / days_in_month
Others…
Useful for visualization while keeping the time axis accessible to a broader audience
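As a minimal sketch of the two decimal formulas above, assuming the lubridate package is loaded; the timestamps are hypothetical and made up for illustration:

```r
library(lubridate)

# Hypothetical timestamps, made up for illustration
stamps <- ymd_hms(c("2009-01-02 12:30:00", "2009-02-14 18:45:00"))

# Decimal time of day: hour + minute / 60
decimal_time <- hour(stamps) + minute(stamps) / 60

# Decimal month: month + day of month / number of days in that month
# (days_in_month() returns a named vector, so the names are dropped)
decimal_month <- month(stamps) + mday(stamps) / unname(days_in_month(stamps))

decimal_time
decimal_month
```

Either variable can then go on the x-axis of a plot in place of the raw date-time.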
Watch for…
Changes in the Mean
Changes in Variability
Unusual features / anomalies
Consider the data generating process
Watch for…
Recurring Patterns
Deterministic Patterns
Patterns you can explain using the data generating process (weekend v. weekday; winter v. summer; vacations; working outside the house)
Consider the data generating process
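One way to check for a weekend versus weekday pattern is to compare group averages. The sketch below uses simulated values with a built-in weekend effect (an assumption for illustration, not the course data) and lubridate's `wday()`:

```r
library(lubridate)

# Four weeks of hypothetical daily values (simulated; not the course data)
set.seed(3)
dates <- seq(ymd("2024-01-01"), by = "day", length.out = 28)

# With week_start = 1, Monday is 1 and Saturday/Sunday are 6/7
is_weekend <- wday(dates, week_start = 1) >= 6

# Build in a weekend effect of +5 on top of random noise
value <- 10 + 5 * is_weekend + rnorm(length(dates))

# Compare the average level by day type
group_means <- tapply(value, ifelse(is_weekend, "weekend", "weekday"), mean)
group_means
```

A clear gap between the two means is the kind of recurring pattern the data generating process can explain.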
Watch for…
High correlation between lagged observations
Note: I added “jitter” (random noise) to see all of the points since they are all recorded as integers
Consider modeling this correlation
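A lag-1 scatterplot with jitter can be sketched in base R as follows. The series here is a rounded random walk, a hypothetical stand-in for integer-valued data rather than the data shown on the slide:

```r
set.seed(2)

# Hypothetical integer-valued series (rounded random walk), for illustration only
y <- round(cumsum(rnorm(200)))
n <- length(y)

# Pair each observation with the one immediately before it
current  <- y[2:n]
previous <- y[1:(n - 1)]

# Correlation between neighbouring observations
lag1_cor <- cor(current, previous)
lag1_cor

# jitter() adds a little random noise so overlapping integer points are visible
plot(jitter(previous), jitter(current),
     xlab = "Previous observation", ylab = "Current observation")
```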
Watch for…
Correlation decreasing in magnitude with larger lags
Larger lag = further away in time
Consider how this correlation decays
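To see how correlation changes with lag, one option is to compute it lag by lag. This sketch assumes a simulated AR(1) series (not the course data); `lag_cor` is a hypothetical helper written for this example:

```r
set.seed(1)

# Simulated AR(1) series with coefficient 0.8 (an assumption for illustration)
y <- as.numeric(arima.sim(model = list(ar = 0.8), n = 500))

# Hypothetical helper: correlation between observations k time steps apart
lag_cor <- function(y, k) {
  n <- length(y)
  cor(y[(k + 1):n], y[1:(n - k)])
}

lags <- 1:10
cors <- sapply(lags, lag_cor, y = y)

# One correlation per lag; plot or print to inspect the decay
round(cors, 2)
```

Base R's `acf(y)` draws a closely related plot of these correlations in one call.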
Learning Goals
Know the properties of expected value and variance of a random variable.
Derive mathematical properties of covariance and correlation using properties of expected value and variance.
Probability Review
Probability Warm Up
For a discrete random variable \(X\),
How do you calculate the expected value? (definition)
List at least two properties of the expected value.
How do you calculate the variance? (definition)
List at least two properties of the variance.
Review: Covariance
Definition of covariance of two random variables:
\[Cov(X,Y) = E((X - \mu_x)(Y - \mu_y))\]
where \(E(X) = \mu_x\) and \(E(Y) = \mu_y\)
Review: Correlation
Definition of correlation of two random variables:
\[Cor(X,Y) = \frac{Cov(X,Y)}{SD(X)SD(Y)}\]
where \(SD(X) = \sqrt{Var(X)}\) and \(SD(Y) = \sqrt{Var(Y)}\)
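The two definitions can be applied directly to a small discrete example. The joint distribution below is hypothetical, chosen only so the arithmetic is easy to follow:

```r
# Hypothetical joint pmf of (X, Y), made up for illustration
joint <- data.frame(
  x = c(0, 0, 1, 1),
  y = c(0, 1, 0, 1),
  p = c(0.4, 0.1, 0.1, 0.4)
)

# Expected values: sum of value times probability
mu_x <- sum(joint$x * joint$p)
mu_y <- sum(joint$y * joint$p)

# Covariance: E((X - mu_x)(Y - mu_y))
cov_xy <- sum((joint$x - mu_x) * (joint$y - mu_y) * joint$p)

# Variances: E((X - mu_x)^2) and E((Y - mu_y)^2)
var_x <- sum((joint$x - mu_x)^2 * joint$p)
var_y <- sum((joint$y - mu_y)^2 * joint$p)

# Correlation: covariance divided by the product of standard deviations
cor_xy <- cov_xy / (sqrt(var_x) * sqrt(var_y))

c(cov = cov_xy, cor = cor_xy)
```

Here the covariance is 0.15 and the correlation is 0.6, since both variances equal 0.25.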
Small Group Activity
You are going to prove three REALLY IMPORTANT properties of Covariance!
We’ll need these properties going forward.
Setup
Introduce yourselves and check in with each other as humans.
Discuss how you want to structure your collaboration. Consider:
Equitable time with the marker
Equitable contributions to the process (adding to the proof, explaining why you can make the step)
Whose role it might be to seek resources (Chp 2 in the Notes, asking another group, asking instructor)
Be open and honest about how comfortable you feel about the challenge; support each other in the productive struggle.
Notes:
If you feel comfortable with the problem, don’t just do it yourself. Talk through strategies/approaches that might be useful; be a guide/coach for the group.
The goal is that everyone should feel comfortable explaining each step of the proofs.
Challenges
You may assume the properties of expected value (no need to prove those here). You may also assume #1 is true to prove #2 and assume #1 and #2 to prove #3.
1. Prove/show: \(Cov(aX,bY) = ab\,Cov(X,Y)\) for random variables \(X\) and \(Y\) and constants \(a,b\). Hint: Start with \(Cov(aX,bY)\) and, using the definition and the properties of expected value, rewrite it as \(ab\,Cov(X,Y)\).
2. Prove/show: \(Cov(X+Y,Z) = Cov(X,Z)+Cov(Y,Z)\) for random variables \(X\), \(Y\), and \(Z\). Hint: Start with \(Cov(X+Y,Z)\) and, using the definition and the properties of expected value, rewrite it as \(Cov(X,Z)+Cov(Y,Z)\).
3. Prove/show: \(Cov(aX+bY,cZ + dW) = ac\,Cov(X,Z)+ad\,Cov(X,W)+bc\,Cov(Y,Z)+bd\,Cov(Y,W)\) for random variables \(X\), \(Y\), \(Z\), and \(W\) and constants \(a\), \(b\), \(c\), and \(d\). Hint: Start by letting \(V = cZ + dW\).
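A numerical sanity check can build confidence in the three identities before you prove them. It is not a substitute for the proofs. The sketch below uses simulated data and arbitrary constants (all made up for illustration) and relies on the fact that the sample covariance from `cov()` obeys the same algebra:

```r
set.seed(4)
n <- 1000

# Simulated variables with some dependence between them (illustration only)
X <- rnorm(n)
Y <- X + rnorm(n)
Z <- rnorm(n) - Y
W <- runif(n)

# Arbitrary constants; cc and dd stand in for c and d to avoid masking c()
a <- 2; b <- -3; cc <- 0.5; dd <- 4

# Property 1: Cov(aX, bY) = ab Cov(X, Y)
check1 <- isTRUE(all.equal(cov(a * X, b * Y), a * b * cov(X, Y)))

# Property 2: Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z)
check2 <- isTRUE(all.equal(cov(X + Y, Z), cov(X, Z) + cov(Y, Z)))

# Property 3: the full bilinear expansion
check3 <- isTRUE(all.equal(
  cov(a * X + b * Y, cc * Z + dd * W),
  a * cc * cov(X, Z) + a * dd * cov(X, W) +
    b * cc * cov(Y, Z) + b * dd * cov(Y, W)
))

c(check1, check2, check3)
```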
Solutions
Probability Warm Up
Definition of Expected Value
Solution
For a discrete random variable \(X\),
\[E(X) = \sum_{i=1}^{\infty} x_i \, P(X = x_i)\]
Properties of Expected Value
Solution
For random variables \(X\) and \(Y\) and constant \(a\),
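the standard properties, as stated in any introductory probability text and relied on in the covariance proofs, include:

\[E(aX) = aE(X), \qquad E(X + a) = E(X) + a, \qquad E(X + Y) = E(X) + E(Y)\]

For example, combining the first and third gives \(E(aX + Y) = aE(X) + E(Y)\).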
Covariance is the average signed area of the rectangles (blue = positive, red = negative)
Recap: Covariance Properties in Pictures
\[Cov(aX,Y) = aCov(X,Y)\]
Multiplying a random variable by a constant stretches the area of each rectangle by that constant, so the covariance is scaled by the same factor
Finishing the Activity
If you didn’t finish the activity, no problem! Be sure to complete the activity outside of class, review the solutions in the online manual, and ask any questions on Slack or in office hours.
Re-organize and review your notes to help deepen your understanding, solidify your learning, and make homework go more smoothly!