Week 9 — Models & Prediction (Oct. 18–22)

This week continues our discussion of regression, covers simulation in more depth, and introduces the idea of minimizing a loss function.

🧐 Content Overview

Element | Length
🎥 More Regression | 3m7s
🎥 Simulation | 14m48s
🎥 Variance and Sums of Squares | 5m59s
🎥 Overfitting | 10m31s
📃 Example of overfitting and underfitting in machine learning | 1180 words
🎥 Bias-Variance | 9m16s
📃 Understanding the Bias-Variance Tradeoff | 3000 words
🎥 Optimizing Loss | 15m23s

This week has 0h59m of video and 4180 words of assigned readings. This week’s videos are available in a Panopto folder and as a podcast.

🎥 Introduction

This video introduces the week.

🎥 Simulation

This video covers simulation in more depth as a method for studying statistical techniques, which is what you are doing in the assignment. I also describe more of NumPy’s random number generation facilities.
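As a rough illustration of simulation for studying a statistic, the sketch below draws many samples from a known distribution and looks at how the sample mean varies. The population parameters, sample sizes, and seed are arbitrary assumptions chosen for the example; they do not come from the video or the assignment.

```python
import numpy as np

# Seeded generator so the simulation is repeatable (the seed value is arbitrary).
rng = np.random.default_rng(20211018)

# Hypothetical population: normal with mean 10 and standard deviation 2.
MU, SIGMA = 10.0, 2.0
N_SAMPLES, SAMPLE_SIZE = 1000, 50

# Draw 1000 samples of size 50 in one call; each row is one sample.
samples = rng.normal(loc=MU, scale=SIGMA, size=(N_SAMPLES, SAMPLE_SIZE))

# Compute the statistic under study (here, the sample mean) for every sample.
sample_means = samples.mean(axis=1)

# The spread of the sample means should be close to the theoretical
# standard error, SIGMA / sqrt(SAMPLE_SIZE).
print("mean of sample means: ", sample_means.mean())
print("std of sample means:  ", sample_means.std(ddof=1))
print("theoretical std error:", SIGMA / np.sqrt(SAMPLE_SIZE))
```

Because we chose the population ourselves, we know the right answer and can check how well the technique recovers it.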

Tip

You should set a random seed for any work that uses randomness, including train/test splits for evaluating predictors.
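One way to follow this tip with NumPy alone is sketched below. The helper `split_indices` is a hypothetical function written for this example, not part of any library, and the seed value is arbitrary.

```python
import numpy as np

def split_indices(n_rows, test_fraction, seed):
    """Return (train_idx, test_idx) from a seeded shuffle of row positions."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_rows)
    n_test = int(round(n_rows * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

train_a, test_a = split_indices(100, 0.2, seed=42)
train_b, test_b = split_indices(100, 0.2, seed=42)

# Same seed, same split: anyone rerunning the analysis gets the same test rows.
print(np.array_equal(test_a, test_b))
```

If you use a library's splitting function instead, check its documentation for how it accepts a seed or generator.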

🎥 Variance, R², and the Sum of Squares

This video provides more detail on explained variance and what \(R^2\) means.
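To make the link between sums of squares and \(R^2\) concrete, here is a minimal sketch on synthetic data. The data-generating values are made up, and the variable names (`ss_total`, `ss_resid`, `ss_model`) are labels chosen for this example rather than notation from the video.

```python
import numpy as np

rng = np.random.default_rng(9)

# Synthetic data: a linear signal plus noise (all values are made up).
x = rng.uniform(0, 10, size=200)
y = 3.0 + 2.0 * x + rng.normal(scale=4.0, size=200)

# Least-squares line; polyfit returns the highest-degree coefficient first.
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = intercept + slope * x

ss_total = np.sum((y - y.mean()) ** 2)      # total variation in y
ss_resid = np.sum((y - y_hat) ** 2)         # variation left unexplained
ss_model = np.sum((y_hat - y.mean()) ** 2)  # variation explained by the line

r_squared = 1 - ss_resid / ss_total
print("R^2 from residuals:", r_squared)
print("explained / total: ", ss_model / ss_total)
```

For a least-squares line with an intercept, the two printed values agree, which is why \(R^2\) can be read as the fraction of variance explained.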

🎥 Overfitting

This video introduces the idea of overfitting: fitting the training data so closely, noise included, that the model predicts the testing data poorly.
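A small sketch of the effect, assuming synthetic data with a quadratic trend and polynomial models of increasing degree. This setup is an illustration chosen for this page and is not necessarily the one used in the video or the notebook.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(2021)

def make_data(n):
    """Hypothetical data: a quadratic trend plus noise."""
    x = rng.uniform(-1, 1, size=n)
    y = 1.0 + 2.0 * x - 3.0 * x ** 2 + rng.normal(scale=0.5, size=n)
    return x, y

def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return np.mean((y_true - y_pred) ** 2)

x_train, y_train = make_data(20)    # small training set
x_test, y_test = make_data(200)     # separate data the models never see

results = {}
for degree in (1, 2, 12):
    # Least-squares polynomial fit using the training data only.
    model = Polynomial.fit(x_train, y_train, deg=degree)
    results[degree] = (mse(y_train, model(x_train)), mse(y_test, model(x_test)))
    print(f"degree {degree:2d}: train MSE {results[degree][0]:.3f}, "
          f"test MSE {results[degree][1]:.3f}")
```

Training error can only go down as the degree increases, because each model contains the simpler ones. The test error is the number to watch: a flexible model that chases noise will typically do worse there than a simpler one.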

📓 Overfitting Simulation

🎥 Replication, Bias, and Variance

📃 Bias-Variance Tradeoff

Read Understanding the Bias-Variance Tradeoff.

Resources

Further reading: Lecture 12: Bias-Variance Tradeoff.

🎥 Optimizing Loss
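As a rough illustration of minimizing a loss function, the sketch below fits a line by plain gradient descent on mean squared error and compares the result with NumPy's least-squares fit. Gradient descent is used here only because it is easy to write out; it is not necessarily the optimization method the video uses, and the data, learning rate, and step count are assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic data from a known line plus noise (values are made up).
x = rng.uniform(0, 1, size=100)
y = 1.5 + 4.0 * x + rng.normal(scale=0.3, size=100)

def mse_loss(params):
    """Mean squared error of the line intercept + slope * x."""
    intercept, slope = params
    return np.mean((y - (intercept + slope * x)) ** 2)

def mse_gradient(params):
    """Gradient of the MSE with respect to (intercept, slope)."""
    intercept, slope = params
    resid = y - (intercept + slope * x)
    return np.array([-2 * resid.mean(), -2 * (resid * x).mean()])

# Start from zero and repeatedly step downhill along the gradient.
params = np.zeros(2)
LEARNING_RATE, N_STEPS = 0.5, 5000
for _ in range(N_STEPS):
    params = params - LEARNING_RATE * mse_gradient(params)

# Reference answer from NumPy's least-squares polynomial fit.
slope_ls, intercept_ls = np.polyfit(x, y, deg=1)
print("gradient descent:", params)
print("least squares:   ", np.array([intercept_ls, slope_ls]))
print("final loss:      ", mse_loss(params))
```

Both approaches minimize the same loss, so they should land on nearly the same intercept and slope.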

🚩 Week 9 Quiz

Take the Week 9 quiz in Blackboard (it will be up by the end of Saturday).

Since this is the second of two very closely intertwined weeks, there are questions about 📅 Week 8 — Regression (Oct. 11–15) in the quiz as well.

✅ Practice

There are several ways you can practice the material so far:

  • Practice more regressions with World Bank data

  • Measure predictive accuracy on World Bank data with train-test evaluation and mean squared error
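For the second practice idea, a minimal train-test evaluation might look like the sketch below. It uses synthetic data as a stand-in, so the variables `x` and `y` are placeholders for whichever World Bank indicators you choose, and the 25% test fraction and the seed are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1018)

# Synthetic stand-in for a real data set (replace with your own columns).
n = 300
x = rng.uniform(0, 10, size=n)
y = 5.0 + 1.2 * x + rng.normal(scale=2.0, size=n)

# Hold out 25% of the rows for testing, using a seeded shuffle.
order = rng.permutation(n)
n_test = n // 4
test_idx, train_idx = order[:n_test], order[n_test:]

# Fit the regression line on the training rows only.
slope, intercept = np.polyfit(x[train_idx], y[train_idx], deg=1)

# Score the predictions on rows the model never saw.
test_pred = intercept + slope * x[test_idx]
test_mse = np.mean((y[test_idx] - test_pred) ** 2)

# Baseline for comparison: always predict the training mean.
baseline_mse = np.mean((y[test_idx] - y[train_idx].mean()) ** 2)

print("test MSE (regression):", test_mse)
print("test MSE (baseline):  ", baseline_mse)
```

Comparing against a simple baseline gives the mean squared error some context, since the raw number depends on the units of the outcome.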

📩 Assignment 4

Assignment 4 is due October 24, 2021.