Learning Goals

The goal of this course is for you to further develop general skills necessary for statistics and data science and gain a working understanding of advanced modeling for correlated data (time series, longitudinal data, and spatial data).

Specific course topics and general skills are listed below.

General Skills

Data Communication

  • In written and oral formats:

    • Inform and justify data analysis and modeling process and the resulting conclusions with clear, organized, logical, and compelling details that adapt to the background, values, and motivations of the audience and context in which communication occurs.

Collaborative Learning

  • Understand and demonstrate characteristics of effective collaboration (team roles, interpersonal communication, self-reflection, awareness of social dynamics, advocating for yourself and others).
  • Develop a common purpose and agreement on goals.
  • Be able to contribute questions or concerns in a respectful way.
  • Share and contribute to the group’s learning in an equitable manner.
  • Develop familiarity and comfort with collaboration tools such as Git and GitHub.

Course Topics

Specific learning objectives for our course topics are listed below. Use these to guide your synthesis of course material for specific topics.

Foundations

Introduction to Correlated Data

  • Explain the similarities and differences between time series, longitudinal, and spatial data.
  • Explain why and how standard methods such as linear regression (estimated by ordinary least squares, OLS) fail on correlated data.
  • Be comfortable working with date/time data and the lubridate R package.
  • Start to become comfortable playing with and manipulating data in R.


Probability Review

  • Know the properties of expected value and variance of a random variable.
  • Derive mathematical properties of covariance and correlation using properties of expected value and variance.


Random Processes

  • Understand mathematical notation used to describe the first and second moments of a sequence of random variables (random process).
  • Construct a covariance matrix from a given autocovariance function.
  • Generate correlated data from a given covariance matrix.
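
As a rough illustration of the last two objectives, here is a minimal base R sketch. The exponential-decay autocovariance function, its parameter values, and the names acvf, Sigma, and L are illustrative assumptions, not material taken from the course.

```r
# Hypothetical autocovariance function: gamma(h) = sigma2 * rho^|h|
# (sigma2 and rho are illustrative choices)
acvf <- function(h, sigma2 = 1, rho = 0.7) sigma2 * rho^abs(h)

n <- 5
times <- 1:n

# Entry (i, j) of the covariance matrix is gamma(t_i - t_j)
Sigma <- outer(times, times, function(s, u) acvf(s - u))

# If z has independent N(0, 1) entries and Sigma = L %*% t(L),
# then y = L %*% z has covariance matrix Sigma
set.seed(1)
L <- t(chol(Sigma))   # chol() returns the upper triangle, so transpose it
z <- rnorm(n)
y <- as.vector(L %*% z)
```

Other factorizations of Sigma (for example an eigendecomposition) would work equally well; the Cholesky factor is used here only because it is available in base R.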


Modeling Covariance

  • Explain and illustrate how we can model covariance by constraining a covariance matrix.
  • Explain and illustrate how we can model covariance by constraining an autocovariance function.
  • Implement covariance model estimation by assuming stationarity (ACF and Semi-Variogram) and explain the R output.


Modeling Components and Detrending

  • Explain the differences between model components such as trend, seasonality, serial error, and measurement error.
  • Understand and implement the tools available to model and remove trends (e.g. polynomials, splines, local regression, moving average filters, differencing) and seasonality (e.g. indicators, sine and cosine curves, seasonal differencing).


Time Series

Time Series - ACF, Random Walk

  • Explain and implement covariance and correlation model estimation by assuming stationarity (ACVF, ACF).
  • Understand the derivations of variance and covariance for the Random Walk.
  • Generate data from a random walk in R and estimate variance and covariance.
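
One possible way to approach the simulation objective above is sketched below in base R. It assumes Gaussian white noise steps, and the seed, sample sizes, and object names are arbitrary choices made for illustration.

```r
set.seed(455)     # arbitrary seed for reproducibility
n_steps <- 100    # length of each random walk
n_reps  <- 2000   # number of independent walks
sigma   <- 1      # standard deviation of each white noise step

# Each column is one random walk: Y_t = W_1 + ... + W_t
walks <- replicate(n_reps, cumsum(rnorm(n_steps, mean = 0, sd = sigma)))

# Empirical variance of Y_t across replicates, for each t;
# the theoretical value is t * sigma^2
emp_var <- apply(walks, 1, var)

# Empirical covariance between Y_s and Y_u;
# the theoretical value is min(s, u) * sigma^2
s_idx <- 20
u_idx <- 50
emp_cov <- cov(walks[s_idx, ], walks[u_idx, ])
```

Comparing emp_var against (1:n_steps) * sigma^2 gives a quick check of the derivation: the variance grows with time, so the random walk is not stationary.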


Time Series - AR(p) and MA(q)

  • Understand the derivations of variance and covariance for the AR(1) and MA(1) model.
  • Understand the notation for AR(p) and MA(q) models and the general mathematical approaches for deriving variance, covariance, and correlation.
  • Recognize general patterns of non-stationarity, AR(p) models, and MA(q) models in example ACF graphs.
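
To make the ACF patterns concrete, the sketch below uses ARMAacf(), arima.sim(), and acf() from the stats package that ships with R. The coefficient value 0.7 and the series length are arbitrary assumptions chosen for illustration.

```r
# Theoretical ACF at lags 0 to 5
acf_ar1 <- ARMAacf(ar = 0.7, lag.max = 5)  # AR(1): decays geometrically as 0.7^h
acf_ma1 <- ARMAacf(ma = 0.7, lag.max = 5)  # MA(1): zero beyond lag 1

# Sample ACF from one simulated AR(1) series, for comparison
set.seed(2)
x <- arima.sim(model = list(ar = 0.7), n = 500)
sample_acf <- acf(x, lag.max = 5, plot = FALSE)$acf
```

The contrast between a gradual decay (AR) and a sharp cutoff (MA) is the pattern to look for in example ACF graphs.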


Time Series - ARMA(p,q)

  • Understand the notation for an ARMA(p,q) model and the general mathematical approaches for deriving variance and covariance.
  • Explain and implement partial autocorrelation function estimation by assuming stationarity (PACF).
  • Explain the common patterns in ACF and PACF for MA(q) and AR(p) models.
  • Fit ARMA models to stationary detrended data.


Time Series - ARIMA and SARIMA

  • Explain and illustrate how first and second order differences remove linear and quadratic trends.
  • Explain and illustrate how seasonal differences remove seasonality.
  • Fit ARIMA and SARIMA models to real data and use model selection tools to decide on a final model.
  • Explain and illustrate the general theoretical procedure for doing forecasting.
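
The differencing objectives above can be illustrated on noise-free toy series with base R's diff(). The trend coefficients and the period-4 seasonal pattern are made-up values for illustration only.

```r
time_idx  <- 1:10
linear    <- 3 + 2 * time_idx        # linear trend with slope 2
quadratic <- 1 + 0.5 * time_idx^2    # quadratic trend

# A first difference turns a linear trend into a constant (the slope)
d1 <- diff(linear)

# A second difference turns a quadratic trend into a constant
d2 <- diff(quadratic, differences = 2)

# A seasonal difference at lag 4 removes a pattern that repeats every 4 steps
seasonal <- rep(c(5, -1, 2, 0), times = 3)
ds <- diff(seasonal, lag = 4)
```

With real data the differenced series would still contain noise, so the result is a series fluctuating around a constant rather than an exact constant.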


Longitudinal

The learning goals may be adjusted before we start the material of this section.

Longitudinal - Introduction

  • Explain and illustrate the differences between ordinary least squares (OLS) and generalized least squares (GLS).


Longitudinal - GEE

  • Explain the common model components of a generalized linear model (GLM).
  • Explain the ideas of working correlation models and robust standard errors.
  • Fit GEE models to real data and interpret the output.


Longitudinal - Mixed Effects

  • Explain and illustrate a random intercept model and connect it to the compound symmetry model.
  • Explain how random intercepts and slopes model correlation.
  • Fit mixed effects models to real data and interpret the output.

Longitudinal - To GEE or Not To GEE

  • Explain and illustrate the differences between GEE and mixed effects models.


Spatial

The learning goals may be adjusted before we start the material of this section.

Spatial - 3 Types of Data

  • Explain and detect three different types of spatial data (point process, areal data, geospatial).
  • Formulate research questions for the three types of spatial data.


Spatial - Intro to Mapping in R

  • Develop comfort making maps in R with the sf package.


Spatial - Areal Data

  • Explain and illustrate the concept of spatial neighbors.
  • Explain and illustrate the concept of spatial correlation through Moran’s I.

Spatial - CAR, SAR Models

  • Explain and connect CAR and SAR models to time series models.
  • Implement CAR and SAR models in R.