Regression: Model Evaluation

Brianna Heggeseth

As we gather

Sit with at least 2 new people and introduce yourself (you choose what you share) to others at your table.


Check out the themes from our Small Group Discussion last week.

Announcements

  • Consider signing up for the MSCS community listserv if you haven’t already.

    • This is where information is shared about MSCS-related events, internship opportunities, etc.
  • Prepare to take notes.

    • Locate the Rmd for today’s activity in the Schedule of the course website (see bottom of slides for url). This is where you’ll keep notes! Use the Rmd however is best for your own learning – you won’t hand it in.
  • Download and open the Rmd in RStudio. If your machine opens the file in a web browser, you can: (1) open a new Rmd in RStudio; and (2) copy-and-paste the contents from the web browser into your Rmd.

    • Save this Rmd in the “STAT 253 > Notes” folder that you created for CP1.
    • Knit the Rmd and check out the general structure.

Small Group Discussion

Go to https://bcheggeseth.github.io/253_spring_2024/model-evaluation.html


Then go to “Small Group Discussion: Video Recap”.

Be prepared to share a few highlights from the group discussion.

Notes - R Code

We are going to use the tidymodels package in R.

  • Similar flavor to tidyverse structure
  • More general structure that allows us to fit many other types of models

Notes - R Code

At first, this will seem like a lot of extra, unnecessary code.


For example, what you did in Stat 155 with

lm(y ~ x1 + x2, data = sample_data)


will look like

library(tidymodels)  # provides linear_reg(), fit(), tidy(), augment(), mae(), etc.

# STEP 1: model specification
lm_spec <- linear_reg() %>%   # we want a linear regression model
  set_mode("regression") %>%  # this is a regression task (y is quantitative)
  set_engine("lm")            # we'll estimate the model using the lm function

# STEP 2: model estimation
model_estimate <- lm_spec %>%
  fit(y ~ x1 + x2, data = sample_data)

But you’ll need to trust me: the more general structure pays off when we fit other types of models.
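To check that the two approaches agree, here is a minimal sketch. It assumes the built-in mtcars data set stands in for sample_data, with mpg, wt, and hp playing the roles of y, x1, and x2. These choices are illustrative only and are not part of the course activity.

```r
library(tidymodels)  # loads parsnip, yardstick, broom, dplyr, and friends

# Stat 155 approach (mtcars ships with R; it is an assumed stand-in for sample_data)
base_fit <- lm(mpg ~ wt + hp, data = mtcars)

# STEP 1: model specification
lm_spec <- linear_reg() %>%
  set_mode("regression") %>%
  set_engine("lm")

# STEP 2: model estimation
model_estimate <- lm_spec %>%
  fit(mpg ~ wt + hp, data = mtcars)

# The underlying lm object is stored in the $fit element of the tidymodels fit
coef(base_fit)
coef(model_estimate$fit)

# Compare the coefficient estimates from the two approaches
same_coefs <- all.equal(coef(base_fit), coef(model_estimate$fit))
same_coefs
```

The tidymodels version takes more lines, but the specification step is the only part that should need to change when we switch to a different type of model.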

Notes - R Code

Useful functions to use on model_estimate:


model_estimate %>% 
  tidy() # gives you coefficients (and standard errors, t-statistics)


model_estimate %>% 
  augment(new_data = sample_data) # gives you predictions and residuals for sample_data


model_estimate %>% 
  glance() # gives you some model evaluation metrics (is it strong?)


model_estimate %>% 
  augment(new_data = sample_data) %>% 
  mae(truth = y, estimate = .pred) # calculates MAE to measure accuracy of predictions
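To make the MAE calculation concrete, the sketch below computes it twice: once with mae() and once by hand as the average absolute difference between observed and predicted values. As an assumption for illustration, it uses the built-in mtcars data set in place of sample_data, with mpg as the outcome and wt and hp as predictors.

```r
library(tidymodels)

# Fit the model (mtcars, mpg, wt, and hp are illustrative stand-ins)
model_estimate <- linear_reg() %>%
  set_mode("regression") %>%
  set_engine("lm") %>%
  fit(mpg ~ wt + hp, data = mtcars)

# Add predictions (.pred) to the data used to fit the model
predictions <- model_estimate %>%
  augment(new_data = mtcars)

# MAE using the yardstick function from the notes
mae_yardstick <- predictions %>%
  mae(truth = mpg, estimate = .pred)

# MAE by hand: mean of |observed - predicted|
mae_by_hand <- predictions %>%
  summarize(mae = mean(abs(mpg - .pred)))

mae_yardstick$.estimate
mae_by_hand$mae
```

The two printed values should match, which is a quick way to confirm what mae() is measuring.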

In-Class Activity

Go back to https://bcheggeseth.github.io/253_spring_2024/model-evaluation.html

Let’s work through the first three exercises together as a large group.

In-Class Activity Directions

Be kind to yourself

  • You will be rusty and make mistakes, and that’s great! Mistakes are important to learning.

Focus on Patterns in Code

  • Review but do not try to memorize any provided code. Focus on the general steps and patterns.

  • If you’re given some starter code with blanks (e.g., dim(___)), don’t type in those chunks. Copy, paste, and modify the starter code in the chunk below it.

Collaboration

  • We’re sitting in groups for a reason. Collaboration improves higher-level thinking, confidence, communication, community, & more. You are expected to:

    • Actively contribute to discussion (don’t work on your own).
    • Actively include all other group members in discussion.
    • Create a space where others feel comfortable making mistakes & sharing their ideas (remember that you all have different experiences, both personal and academic).
    • Stay in sync while respecting that everybody has different learning strategies, work styles, note taking strategies, etc. If some people are working on exercise 10 and others on exercise 2, that’s not a good collaboration.

  • Don’t rush. You won’t hand anything in and can finish up outside of class.

NOTE: I will ask you to reflect upon your collaboration skills throughout the semester.

Ask questions

  • We will not discuss these exercises as a class. Your group should ask me questions as I walk around the room.

After Class

Finish the activity

  • Be sure to complete the activity outside of class, review the solutions on the course website, and ask any questions on Slack or in office hours.

  • Re-organize and review your notes to help deepen your understanding, solidify your learning, and make homework go more smoothly!

  • An R code video, posted for today on the Schedule, talks through the new tidymodels code. This video is OPTIONAL. Decide what’s right for you.

  • Set up Slack if you haven’t already. I’ll be posting announcements there from now on.

Upcoming due dates

  • Wednesday: Homework 1 (HW1)
    Consider inviting others to work with you through Slack!
  • Thursday, 10 minutes before your section: Checkpoint 2 (CP2)
    No video. This is a syllabus review & Slack task.