Building Flexible Models (Nonparametric, Nonlinear Models)
General concepts that translate to other Supervised Learning:
Overfitting
Cross Validation
Bias-Variance Tradeoff
Algorithms and tuning parameters
Preprocessing steps
Parametric vs. nonparametric models
Concept Quiz 1
Part 1
on paper
closed people, closed laptop
you can bring in one 8.5x11 sheet of notes (typed or handwritten, small or big, etc.); you will hand this in with Part 1
Part 1 is due by the end of the class
you might be asked to interpret some R output, but I won’t ask you to provide any code
Part 2
on computers
you can chat with any current STAT 253 student, but nobody else (including preceptors)
you can DM or email me clarifying questions; if something is confusing, I’ll share my answer with the entire class
you can use any materials from this STAT 253 course (from course site or Moodle or textbook), but no internet, ChatGPT, etc
this is designed to be finished during class, but you can hand it in any time within 24 hours of your class end time (e.g., by 11:10am the next day for the 9:40am section)
Content
Units 1–3
questions range in style, including multiple choice, fill in the blank, short response, matching, etc
Focus on the structure of each algorithm: know what you can change (e.g., tuning parameters, preprocessing steps) and how each change affects the algorithm conceptually
Small Group Activity
Group Assignment 1
10 minutes: Each person takes a turn summarizing what they learned about the data and how well they were able to predict arrival delays. Share the best CV MAE you got. Share your code with each other (send Rmd files via Slack DM).
5 minutes: Decide, as a group, what tools you want to use to build a predictive regression model and in what order.
20-30 minutes: Open the template Rmd from Moodle and copy code from your individual Rmd files into the Implementation section, following the agreed-upon order, to produce a first draft.
You’ll need to refine and adjust code as you go.
One person will need to take “Lead” on typing but everyone should have a say in the conceptual choices. Decide on a way to share this code.
5 minutes: Decide who is going to “Lead” the writing of each section of the template and who is going to “Review” each section. Each section should have one lead who will write the first draft of text and another person to review/edit/improve.
Data (can be paired with Research Question)
Who, what, where, when, why, how
Give insight into the outcome and a summary of the available predictors (no lists and no R variable names)
Model Building
Describe the tools and the order in which they are used to build the model, and justify those choices
Model Evaluation
Answer the four questions in paragraph form
Work on sections and touch base with each other as needed.
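When comparing CV MAE values across group members, it helps to confirm that everyone computed the metric the same way. The following is a minimal sketch in base R of one way a k-fold cross-validated MAE could be computed for a linear regression model. The helper name `cv_mae`, the default of 10 folds, the seed, and the `mtcars` example are all assumptions for illustration; they are not the course's code or the assignment's data.

```r
# Hypothetical helper (not from the course materials): k-fold CV MAE for lm().
# Assumes the outcome appears untransformed on the left side of the formula.
cv_mae <- function(formula, data, k = 10, seed = 253) {
  set.seed(seed)
  # Randomly assign each row to one of k folds of (nearly) equal size
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  # Name of the outcome variable (first variable in the formula)
  outcome <- all.vars(formula)[1]

  fold_mae <- vapply(seq_len(k), function(i) {
    train <- data[folds != i, , drop = FALSE]  # fit on the other k - 1 folds
    test  <- data[folds == i, , drop = FALSE]  # evaluate on the held-out fold
    fit   <- lm(formula, data = train)
    mean(abs(test[[outcome]] - predict(fit, newdata = test)))
  }, numeric(1))

  # CV MAE is the average of the k fold-level MAEs
  mean(fold_mae)
}

# Illustration on a built-in dataset, not the arrival-delay data
cv_mae(mpg ~ wt + hp, data = mtcars, k = 5)
```

Fixing the seed keeps the fold assignment reproducible, so two people running the same model on the same data should report the same CV MAE. If group members used different numbers of folds or different seeds, small differences in their values are expected.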
After Class
Reflection & Review
Use the Concept Map to help you with HW 4 and to synthesize material for Concept Quiz 1.
Group Assignment
Before leaving, you should have a clear idea of what you are going to contribute to the assignment.
Upcoming due dates
Thursday: CP8 before class (classification via logistic regression)
Thursday: HW 4 (posted on Moodle)
Tuesday Feb 27: Concept Quiz 1 on Units 1–3 (up to and including today)
Work on the Concept Map as a way to review and synthesize