3.3 Models, Simplifications, & Constraints
We’ve been discussing covariance and correlation in terms of probability theory so far. What if we have data?
We need replicates to estimate any parameter (mean, covariance, correlation, etc.). For covariance of pairs of variables, that means we need either
- multiple realizations of a random process (which could be multiple subjects), OR
- simplifying assumptions about the structure of the autocovariance function.
3.3.1 Common Constraints
When we model covariance, we often make simplifying assumptions and impose model constraints so that we can estimate the covariance from data. Below are some of the common constraints and assumptions we use. Like any assumption, we should check that it is valid for the data set at hand before reporting the model’s conclusions.
3.3.1.1 Weak Stationarity
If the autocovariance function of a random process only depends on the difference in time/space, \(s-t\) (e.g., covariance depends only on the difference in observation times and not time itself), then we say that the autocovariance is weakly stationary, such that
\[\Sigma_Y(t, s) = \Sigma_Y(s-t)\] For example, this means that the covariance between the 1st and 4th observation is the same as the 2nd and 5th, 3rd and 6th, etc. This simplifies our function as we only need to know the difference in time/space rather than the exact time/space.
Additionally, a random process that is weakly stationary has a constant mean, \(\mu = E(Y_t) = E(Y_s)\), and constant variance, \(\sigma^2=Var(Y_t) = Var(Y_s)\). When we use the term “stationary” from now on, we refer to a weakly stationary process.
A random process is weakly stationary if it has
- a constant mean, \(\mu = E(Y_t) = E(Y_s)\),
- a constant variance, \(\sigma^2=Var(Y_t) = Var(Y_s)\), and
- an autocovariance (and autocorrelation) function that depends only on the vector difference in time/space.
In a covariance matrix, the constant variance from stationarity, \(\sigma_{i}^2 = \sigma^2\), means that we can write the covariance matrix as the variance times the correlation matrix,
\[\boldsymbol{\Sigma}_Y = \sigma^2\mathbf{R}_Y\] where \(\mathbf{R}_Y\) is the correlation matrix.
Additionally, with the stationary assumption, the correlation matrix can be written in terms of the vector difference in time/space, similar to the covariance matrix.
If we are considering spatial data where space is defined by longitude and latitude, a weakly stationary process means that the covariance depends on the difference between two points in space, which is determined by both the distance (the Euclidean, “as the crow flies” distance) and the direction (e.g., a point on a compass such as NE or West, or the angle in degrees from North). In linear algebra terminology, this difference is the vector difference.
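As a small sketch of what stationarity buys us computationally (the AR(1)-style autocorrelation \(0.6^{|h|}\) and \(\sigma^2 = 4\) here are assumed values for illustration, not from the text), the covariance matrix of equally spaced observations from a weakly stationary process has constant diagonals, since each entry depends only on the lag:

```r
# Sketch (assumed setup): a weakly stationary process observed at times
# t = 1, ..., 5, with constant variance sigma^2 = 4 and an autocorrelation
# that depends only on the lag, rho(h) = 0.6^|h|.
sigma2 <- 4
rho <- function(h) 0.6^abs(h)

times <- 1:5
# Sigma[i, j] depends only on the lag t_i - t_j, not on i or j themselves,
# so Sigma = sigma^2 * R with R built from lags alone
Sigma <- sigma2 * outer(times, times, function(s, t) rho(s - t))

# Stationarity shows up as constant diagonals (a Toeplitz matrix):
Sigma[1, 4] == Sigma[2, 5]   # TRUE: covariance at lag 3 is the same everywhere
```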
3.3.1.2 Isotropy
A stronger assumption than weak stationarity is that the autocovariance function depends only on the distance in time/space, \(||s-t||\).
If this is true, then we typically say that the autocovariance function and the random process are isotropic: the dependence does not change with the direction of the difference in space (NE v. West), only with its length, the norm of the vector difference. That is, \[\Sigma_Y(t, s) = \Sigma_Y(||s-t||)\]
Notes:
- This distinction only applies to spatial data in which our index is a vector of length two or more.
- Many spatial models we will use assume an isotropic covariance function.
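To make the isotropy idea concrete, here is a minimal sketch (the exponential correlation form and \(\phi = 2\) are assumed values for illustration) showing that two pairs of points at the same distance but in different directions get the same correlation:

```r
# Sketch (assumed setup): an isotropic exponential correlation in 2-D space
# depends only on the Euclidean distance ||s - t||, not on direction.
phi <- 2
rho_iso <- function(s, t) exp(-sqrt(sum((s - t)^2)) / phi)

origin <- c(0, 0)
east   <- c(3, 0)   # 3 units east of the origin
north  <- c(0, 3)   # 3 units north: same distance, different direction

rho_iso(origin, east) == rho_iso(origin, north)   # TRUE: direction is irrelevant
```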
3.3.1.3 Intrinsic Stationarity
If assuming a process is weakly stationary is too strong of a constraint, a slightly weaker one is intrinsic stationarity: a random process is intrinsically stationary if
\[Var(Y_{s+h} - Y_s)\text{ depends only on } h\] meaning that the variance of the difference of random variables depends only on the vector difference, \(h\), in time or space.
This is closely related to weak stationarity: weak stationarity implies intrinsic stationarity, but not vice versa.
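A classic sketch of the distinction (the random walk here is an assumed illustration, not from the text): a random walk with iid \(N(0,1)\) steps is intrinsically stationary, since \(Var(Y_{s+h} - Y_s) = h\), but it is not weakly stationary, since \(Var(Y_t) = t\) grows with time.

```r
# Sketch (assumed example): simulate many realizations of a random walk
# Y_t = sum of t iid N(0, 1) steps.
set.seed(1)
n_sim <- 5000; n_time <- 100
# each row of Y is one realization of the walk
Y <- t(apply(matrix(rnorm(n_sim * n_time), n_sim, n_time), 1, cumsum))

h <- 10
var(Y[, 30 + h] - Y[, 30])  # approx. h = 10, regardless of the start time s
var(Y[, 60 + h] - Y[, 60])  # approx. 10 again: intrinsically stationary
var(Y[, 90])                # approx. 90: the variance depends on t itself,
                            # so the walk is NOT weakly stationary
```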
3.3.2 Common Model Structures
Here are some common correlation functions that are all weakly stationary and isotropic:
3.3.2.1 Compound Symmetry / Exchangeable correlation
The correlation between two random variables is constant, no matter the difference in time/space.
The autocorrelation function is defined as
\[\rho_Y(s-t) = \begin{cases} \rho \text{ if } s-t \not= 0\\ 1 \text{ if } s-t = 0\\ \end{cases}\]
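For instance, an exchangeable correlation matrix for four observations (with an assumed \(\rho = 0.5\) for illustration) can be built directly:

```r
# Sketch: a compound symmetry / exchangeable correlation matrix for
# 4 observations with rho = 0.5 (an assumed value) -- every off-diagonal
# entry is the same constant, no matter how far apart the observations are.
rho <- 0.5
n <- 4
R <- matrix(rho, n, n)
diag(R) <- 1
R
```

Note that the correlation between observations 1 and 4 is the same as between 2 and 3, even though the time gaps differ.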
3.3.2.2 Exponential Correlation
The correlation between two random variables decays to 0 exponentially as the difference in time/space increases.
\[\rho_Y(s-t) = e^{-||s-t||/\phi}\]
where \(\phi > 0\)
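A quick sketch of how \(\phi\) controls the decay rate (\(\phi = 2\) is an assumed value for illustration):

```r
# Sketch: the exponential correlation function at a few differences.
phi <- 2
rho_exp <- function(h) exp(-h / phi)

rho_exp(0)        # 1: a variable is perfectly correlated with itself
rho_exp(phi)      # exp(-1), about 0.37, at a difference of phi
rho_exp(3 * phi)  # about 0.05 -- nearly zero by a difference of 3 * phi
```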
3.3.2.3 Squared Exponential / Gaussian Correlation
The correlation between two random variables also decays to 0 as the difference in time/space increases, but with a different shape than the exponential: it decays more slowly than the exponential for differences < \(\phi\), but for differences > \(\phi\), it decays faster.
\[\rho_Y(s-t) = e^{-(||s-t||/\phi)^2}\] where \(\phi > 0\)
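We can check this decay comparison numerically; this is a sketch with an assumed \(\phi = 2\):

```r
# Sketch: the squared exponential decays more slowly than the exponential
# below phi, matches it exactly at phi, and decays faster beyond phi.
phi <- 2
rho_exp   <- function(h) exp(-h / phi)
rho_sqexp <- function(h) exp(-(h / phi)^2)

rho_sqexp(1) > rho_exp(1)        # TRUE: h = 1 < phi, slower decay
rho_sqexp(phi) == rho_exp(phi)   # TRUE: the two agree at h = phi
rho_sqexp(3) < rho_exp(3)        # TRUE: h = 3 > phi, faster decay
```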
3.3.2.4 Spherical Correlation
The correlation between two random variables decays to 0 as the difference in time/space increases, with yet another decay shape; unlike the exponential and squared exponential, it reaches exactly 0 once the difference is \(\phi\) or more.
\[\rho_Y(s-t) =\begin{cases} 1 - 1.5(||s-t||/\phi) + 0.5(||s-t||/\phi)^3 \text{ if } ||s-t|| < \phi \\ 0 \text{ otherwise }\\ \end{cases}\]
where \(\phi > 0\)
```r
library(tidyverse)  # for tibble, mutate, if_else, and ggplot (if not already loaded)

phi <- 2

tibble(
  h = seq(0, 5, length = 500)  # distance
) %>%
  mutate(rho = if_else(h < phi, 1 - 1.5 * (h / phi) + 0.5 * (h / phi)^3, 0)) %>%
  ggplot(aes(x = h, y = rho)) +
  geom_line() +
  geom_hline(yintercept = 0, color = 'grey') +
  ylim(-1, 1) +
  labs(title = 'Autocorrelation Function:\nSpherical', x = 'Distance') +
  theme_classic()
```
Let’s look at the last three on the same graph. With the same value of \(\phi\), the correlation function can look quite different.
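One way to build that comparison plot (a sketch in the same tidyverse style, reusing the assumed \(\phi = 2\) from above):

```r
library(tidyverse)  # tibble, dplyr, tidyr, and ggplot2

phi <- 2

# evaluate all three correlation functions on the same grid of distances
acf_df <- tibble(h = seq(0, 5, length = 500)) %>%
  mutate(
    Exponential           = exp(-h / phi),
    `Squared Exponential` = exp(-(h / phi)^2),
    Spherical             = if_else(h < phi,
                                    1 - 1.5 * (h / phi) + 0.5 * (h / phi)^3,
                                    0)
  ) %>%
  pivot_longer(-h, names_to = "Model", values_to = "rho")

acf_df %>%
  ggplot(aes(x = h, y = rho, color = Model)) +
  geom_line() +
  geom_hline(yintercept = 0, color = 'grey') +
  ylim(-1, 1) +
  labs(title = 'Autocorrelation Functions', x = 'Distance') +
  theme_classic()
```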