Week 9 — Models & Prediction
Activities:
- Introduction (3m7s)
- Simulation (14m48s)
- Variance, R², and the Sum of Squares (5m59s)
- Overfitting (10m31s)
- Overfitting Simulation
- Overfitting Example (1180 words)
- Week 9 Quiz
- Replication, Bias, and Variance (9m16s)
- Bias-Variance Tradeoff (3000 words)
- Optimizing Loss (14m3s)
- Loss-Based Regression Notebook
- Practice
- Assignment 4
This week's videos are also available in a Panopto playlist.
Introduction
This video introduces the week.
Simulation
This video talks more about simulation as a method for studying statistical techniques, which you are doing in the assignment. I also describe more of NumPy's random number generation facilities.
Tip
Set a random seed for any work that uses randomness, including train/test splits for evaluating predictors, so that your results are reproducible.
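As a minimal sketch of that tip, the following seeds a NumPy generator, draws a simulated sample, and makes a reproducible train/test split. The seed value, sample size, and 20% test fraction are arbitrary choices for illustration, not values from the videos or the assignment.

```python
import numpy as np

# A seeded generator: the same seed yields the same stream of draws.
SEED = 20201025  # arbitrary value chosen for this example
rng = np.random.default_rng(SEED)

# Simulate 100 observations from a standard normal distribution.
sample = rng.normal(loc=0.0, scale=1.0, size=100)

# Reproducible train/test split: shuffle the row indices, hold out 20%.
n = len(sample)
shuffled = rng.permutation(n)
n_test = n // 5
test_idx = shuffled[:n_test]
train_idx = shuffled[n_test:]

# Re-creating the generator with the same seed reproduces the same draws.
rng_again = np.random.default_rng(SEED)
sample_again = rng_again.normal(loc=0.0, scale=1.0, size=100)
print(np.array_equal(sample, sample_again))
```

Creating one generator near the top of a notebook and passing it to everything that needs randomness keeps the whole analysis repeatable from a single seed.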
Variance, R², and the Sum of Squares
This video provides more detail on explained variance and what the
Resources
Overfitting
This video introduces the idea of overfitting: fitting the training data so closely that the model predicts the testing data poorly.
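One way to see this for yourself is a small simulation, sketched below. It fits polynomials of increasing degree to a small noisy training set and compares training and testing error. The sine-shaped true relationship, the noise level, the sample sizes, and the degrees are all assumptions made for this example; they are not taken from the video.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(42)

def make_data(n):
    """Draw n points from an assumed true curve plus normal noise."""
    x = rng.uniform(-1, 1, size=n)
    y = np.sin(np.pi * x) + rng.normal(scale=0.3, size=n)
    return x, y

def mse(y, y_hat):
    """Mean squared error between observed and predicted values."""
    return float(np.mean((y - y_hat) ** 2))

# Small training set, larger test set from the same process.
x_train, y_train = make_data(20)
x_test, y_test = make_data(200)

results = {}
for degree in (1, 3, 10):
    # Least-squares polynomial fit on the training data only.
    coefs = P.polyfit(x_train, y_train, degree)
    train_mse = mse(y_train, P.polyval(x_train, coefs))
    test_mse = mse(y_test, P.polyval(x_test, coefs))
    results[degree] = (train_mse, test_mse)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

Training error can only go down as the degree grows, because each higher-degree model contains the lower-degree ones. Testing error has no such guarantee, and a gap that widens between the two is the usual sign of overfitting.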
Overfitting Simulation
Overfitting Example
Read Example of overfitting and underfitting in machine learning.
Week 9 Quiz
Take the Week 9 quiz in Blackboard (will be up by end of Saturday).
Replication, Bias, and Variance
Bias-Variance Tradeoff
Read Understanding the Bias-Variance Tradeoff.
Resources
Further reading: Lecture 12: Bias-Variance Tradeoff.
Optimizing Loss
Loss-Based Regression Notebook
Read the minimization regression notebook.
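To give a feel for fitting a regression by minimizing a loss function, here is a minimal sketch that is not taken from the notebook. It fits a line by plain gradient descent on mean squared error and checks the result against NumPy's least-squares fit. The simulated data, learning rate, and iteration count are assumptions for illustration, and the notebook may well use a different optimizer.

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated data with an assumed true intercept of 2.0 and slope of 0.5.
x = rng.uniform(0, 10, size=200)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=200)

def loss(params):
    """Mean squared error of the line b0 + b1 * x."""
    b0, b1 = params
    resid = y - (b0 + b1 * x)
    return float(np.mean(resid ** 2))

def gradient(params):
    """Partial derivatives of the loss with respect to b0 and b1."""
    b0, b1 = params
    resid = y - (b0 + b1 * x)
    return np.array([-2 * np.mean(resid), -2 * np.mean(resid * x)])

# Start from zero and repeatedly step downhill along the gradient.
params = np.zeros(2)
learning_rate = 0.01
for _ in range(20000):
    params = params - learning_rate * gradient(params)

# np.polyfit returns coefficients from highest degree down: slope, intercept.
slope, intercept = np.polyfit(x, y, 1)
print("gradient descent:", params)
print("least squares:   ", np.array([intercept, slope]))
print("final loss:      ", loss(params))
```

Squared error has a closed-form solution, so the optimizer is not needed here. The value of the loss-based view is that the same loop still works when you swap in a loss that has no closed form.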
Practice
There are several ways you can practice the material so far:
- Practice more regressions with World Bank data
- Measure World Bank data predictive accuracy with train-test evaluation and mean squared error
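For the second practice item, the workflow might look like the sketch below. It uses simulated numbers as a stand-in for the World Bank data, so the variables, the 25% test fraction, and the mean-only baseline are assumptions for illustration rather than part of the exercise.

```python
import numpy as np

rng = np.random.default_rng(2020)

# Simulated stand-in for a real table: one predictor and one outcome.
n = 300
x = rng.normal(size=n)
y = 1.5 * x + rng.normal(scale=0.5, size=n)

# Hold out 25% of the rows for testing.
idx = rng.permutation(n)
n_test = n // 4
test_idx, train_idx = idx[:n_test], idx[n_test:]

# Fit a line on the training rows only.
slope, intercept = np.polyfit(x[train_idx], y[train_idx], 1)

# Score the fitted line on the held-out rows.
pred = intercept + slope * x[test_idx]
test_mse = float(np.mean((y[test_idx] - pred) ** 2))

# Baseline for comparison: always predict the training mean.
baseline_mse = float(np.mean((y[test_idx] - y[train_idx].mean()) ** 2))

print(f"test MSE:     {test_mse:.3f}")
print(f"baseline MSE: {baseline_mse:.3f}")
```

Comparing against a simple baseline gives the test error some context, since a mean squared error on its own depends on the scale of the outcome.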
Assignment 4
Assignment 4 is due October 25, 2020.