Checkpoint 1

You can download a template RMarkdown file to start from here.

You may complete this checkpoint with pencil and paper (make sure to clearly mark the questions and parts) or with LaTeX within RMarkdown.

If you choose to use LaTeX, I’ve provided you with some structure. Check out https://www.caam.rice.edu/~heinken/latex/symbols.pdf for a cheatsheet on writing math with LaTeX.

Part 1: Covariance Matrices

The covariance matrix collects all pairwise covariances in matrix form. The \(ij\)th element of the matrix is \(Cov(X_i,X_j) = \sigma_{ij}\). The correlation between two random variables \(X_i\) and \(X_j\) is defined as \(\rho_{ij} = \frac{Cov(X_i,X_j)}{\sqrt{Var(X_i)}\sqrt{Var(X_j)}}=\frac{\sigma_{ij}}{\sqrt{\sigma^2_{i}}\sqrt{\sigma^2_{j}}}\).
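To make the definition of correlation concrete, here is a minimal R sketch that converts a covariance matrix into a correlation matrix entry by entry. The matrix `Sigma` and its values are made up purely for illustration; they are not part of the checkpoint.

```r
# Hypothetical 2x2 covariance matrix (values chosen only for illustration)
Sigma <- matrix(c(4.0, 1.2,
                  1.2, 1.0), nrow = 2, byrow = TRUE)

# Standard deviations are the square roots of the diagonal entries
sds <- sqrt(diag(Sigma))

# rho_ij = sigma_ij / (sigma_i * sigma_j), applied to every entry at once
Gamma <- Sigma / outer(sds, sds)

# Off-diagonal entry: 1.2 / (2 * 1) = 0.6; diagonal entries are 1
Gamma

# Base R's cov2cor() performs the same conversion
all.equal(Gamma, cov2cor(Sigma))
```

The diagonal of the result is always 1 because each variable's covariance with itself is its variance.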

Using the definition of covariance of two random variables, \(Cov(X,Y) = E((X - \mu_x)(Y-\mu_y))\), properties of random vectors, and basics of matrix algebra (see Chp 8 in online notes), show the following:

Show that the equation below results in a covariance matrix with the variances along the diagonal and the covariances on the off-diagonal for a random vector \(\mathbf{X} = (X_1,X_2)\). \[\boldsymbol{\Sigma} = E((\mathbf{X} - E(\mathbf{X}))(\mathbf{X} - E(\mathbf{X}))^T)\]

ANSWER:

\[E((\mathbf{X} - E(\mathbf{X}))(\mathbf{X} - E(\mathbf{X}))^T) = E\left[\left( \left(\begin{array}{c} X_1\\ X_2 \end{array}\right) - E\left(\begin{array}{c} X_1\\ X_2 \end{array}\right)\right) \left( \left(\begin{array}{c} X_1\\ X_2 \end{array}\right) - E\left(\begin{array}{c} X_1\\ X_2 \end{array}\right)\right)^T \right]\]

\[ = \; ...\]

Show that the equation below results in a covariance matrix with the variances along the diagonal and the covariances on the off-diagonal for a random vector \(\mathbf{X} = (X_1,X_2)\).

\[\boldsymbol{\Sigma} =\mathbf{V}^{1/2}\boldsymbol\Gamma \mathbf{V}^{1/2}\] where \(\mathbf{V}^{1/2}\) is a diagonal matrix with standard deviations (\(\sigma_1,\sigma_2\)) along the diagonal and \(\boldsymbol\Gamma\) is the correlation matrix.

ANSWER:

\[\mathbf{V}^{1/2}\boldsymbol\Gamma \mathbf{V}^{1/2} = \left(\begin{array}{cc} \sigma_1 & 0\\ 0 & \sigma_2 \end{array}\right) \left(\begin{array}{cc} 1 & \rho_{12}\\ \rho_{12} & 1 \end{array}\right) \left(\begin{array}{cc} \sigma_1 & 0\\ 0 & \sigma_2 \end{array}\right)\]

\[ = \; ...\]

Using what you proved above, \(\boldsymbol{\Sigma} = E((\mathbf{X} - E(\mathbf{X}))(\mathbf{X} - E(\mathbf{X}))^T)\), and the properties of matrix algebra (see Chp 8 in online notes), prove the following:

\(Cov(\mathbf{AX}) = \mathbf{A}Cov(\mathbf{X})\mathbf{A}^T\) for a random vector \(\mathbf{X}\) of length \(k\) and a \(k\times k\) constant matrix \(\mathbf{A} = \left(\begin{array}{cccc}a_{11}&a_{12}&\cdots&a_{1k}\\a_{21}&a_{22}&\cdots&a_{2k}\\ \vdots&\vdots&\ddots&\vdots\\ a_{k1}&a_{k2}&\cdots&a_{kk}\end{array}\right)\).

You may use anything that you have proved thus far.

ANSWER:

\[Cov(\mathbf{AX}) = E((\mathbf{AX} - E(\mathbf{AX}))(\mathbf{AX} - E(\mathbf{AX}))^T)\] \[ = \; ...\]
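A numerical sanity check is not a proof, but it can help you catch a misplaced transpose. The sketch below assumes a hypothetical \(3\times 3\) matrix `A` and simulated data; the seed, dimensions, and entries are arbitrary choices. It relies on the fact that the sample covariance obeys the same identity as the theoretical one.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

k <- 3
# Each row of X is one draw of a length-k random vector
X <- matrix(rnorm(1000 * k), ncol = k)

# Hypothetical constant k x k matrix
A <- matrix(c(1, 2,  0,
              0, 1, -1,
              3, 0,  1), nrow = k, byrow = TRUE)

# Row i of AX is A %*% x_i, written for all draws at once
AX <- X %*% t(A)

lhs <- cov(AX)                 # sample Cov(AX)
rhs <- A %*% cov(X) %*% t(A)   # A Cov(X) A^T with the sample Cov(X)

# The two agree up to floating-point error
all.equal(lhs, rhs)
```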

Prove the following theorem for continuous and discrete random variables: If \(X_l\) and \(X_j\) are independent, then \(Cov(X_l, X_j) = 0\). You may use anything that you have proved thus far and basic definitions of expected value, variance, and covariance for random variables and vectors.

Note: The converse is not true. (You don’t need to prove this)
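To see why the converse fails, consider a small illustrative example (my own choice, not one from the notes): let \(X\) take the values \(-1, 0, 1\) with equal probability and let \(Y = X^2\). The sketch below computes the covariance exactly from the probabilities and then compares a joint probability with the product of the marginals.

```r
# X is uniform on {-1, 0, 1}; Y is completely determined by X
x <- c(-1, 0, 1)
p <- rep(1 / 3, 3)
y <- x^2

# Cov(X, Y) = E(XY) - E(X) E(Y), computed exactly from the pmf
cov_xy <- sum(p * x * y) - sum(p * x) * sum(p * y)
cov_xy   # 0: X and Y are uncorrelated

# Independence would require P(X = 0, Y = 0) = P(X = 0) P(Y = 0)
p_joint   <- sum(p[x == 0 & y == 0])        # 1/3
p_product <- sum(p[x == 0]) * sum(p[y == 0])  # 1/3 * 1/3 = 1/9
c(joint = p_joint, product = p_product)
```

The covariance is zero even though \(Y\) is a function of \(X\), so zero covariance does not imply independence.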

Hint: Two random variables are said to be statistically independent if and only if \[f(x_l,x_j) = f_{l}(x_l)f_j(x_j) \] for all possible values of \(x_l\) and \(x_j\) for continuous random variables and \[P(X_l = x_l, X_j = x_j)=P(X_l=x_l)P(X_j=x_j) \] for discrete random variables.

ANSWER: Assume \(X_l\) and \(X_j\) are independent. Then,

\[Cov(X_l, X_j) = \] \[ = \; ...\]

Part 2: Estimate Covariance and Correlation

This is an extension of an in-class activity.

Imagine that we generate a time series where the next observation is equal to 0.90 times the past value plus some independent noise. Use R to generate a time series of 500 observations, plot that series, \(x_t\), as a function of the time index \(t\), and then estimate the covariance and correlation functions, assuming the series is stationary.
Imagine that we generate a time series where the next observation is equal to -0.30 times the past value plus some independent noise. Use R to generate a time series of 500 observations, plot that series, \(x_t\), as a function of the time index \(t\), and then estimate the covariance and correlation functions, assuming the series is stationary.
Similar to class, create a 10x10 covariance matrix where the constant variance is 0.25 and the correlation for all lags is 0.7 (except lag = 0). Then use the Cholesky Decomposition method to generate 500 series of 10 observations with that covariance structure. Then, estimate the covariance and correlation matrices WITHOUT assuming it is stationary. To do this, organize the 500 series of 10 observations into a 500x10 matrix Y and use cov(Y).
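The three tasks above could be approached along the following lines. This is only a sketch under my own assumptions, not the required solution: the helper `simulate_ar1()` is a hypothetical name, the noise is taken to be standard normal, the first value is drawn from the noise distribution, and the seed is arbitrary. Base R's `acf()` is one way to estimate the covariance and correlation functions under stationarity.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical helper: x_t = phi * x_{t-1} + noise, with N(0, sd_noise^2) noise
simulate_ar1 <- function(n, phi, sd_noise = 1) {
  x <- numeric(n)
  x[1] <- rnorm(1, sd = sd_noise)  # assumed starting value
  for (t in 2:n) {
    x[t] <- phi * x[t - 1] + rnorm(1, sd = sd_noise)
  }
  x
}

x_pos <- simulate_ar1(500, phi = 0.90)
x_neg <- simulate_ar1(500, phi = -0.30)

# Plot one series against its time index
plot(seq_along(x_pos), x_pos, type = "l", xlab = "t", ylab = "x_t")

# Estimated covariance and correlation functions (assuming stationarity)
acov_pos <- acf(x_pos, type = "covariance", plot = FALSE)
acor_pos <- acf(x_pos, type = "correlation", plot = FALSE)
acor_neg <- acf(x_neg, type = "correlation", plot = FALSE)

# 10x10 covariance matrix: variance 0.25, correlation 0.7 at every nonzero lag
k <- 10
Gamma <- matrix(0.7, nrow = k, ncol = k)
diag(Gamma) <- 1
Sigma <- 0.25 * Gamma

# chol() returns an upper-triangular R with t(R) %*% R = Sigma,
# so L = t(R) is the lower-triangular Cholesky factor
L <- t(chol(Sigma))

# Columns of Z are independent standard normal vectors of length 10
Z <- matrix(rnorm(500 * k), nrow = k)

# Each column of L %*% Z has covariance Sigma; transpose to get a 500x10 matrix
Y <- t(L %*% Z)

S_hat <- cov(Y)  # estimated covariance matrix, no stationarity assumed
R_hat <- cor(Y)  # estimated correlation matrix
```

Comparing `S_hat` and `R_hat` with `Sigma` and `Gamma` gives a sense of how much sampling variability remains with 500 series.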