Week 11 — Evaluation
There is no quiz this week.
This week's videos are also available as a Panopto playlist.
Intro & Context
In this video, I review where we are conceptually and recap the ideas of estimating conditional probability and expectation.
What are some useful techniques for engineering features in an application?
How do you do feature engineering and model selection in a machine learning workflow? What is the iterative process involved?
In this video, I introduce SciKit pipelines that put multiple transformations together.
SciKit Learn Pipelines
SciKit Learn Preprocessing
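As a rough illustration of the pipeline idea, the sketch below chains a preprocessing step and a classifier into one estimator. The synthetic data, the step names (`scale`, `classify`), and the choice of `StandardScaler` with `LogisticRegression` are assumptions made for this example and are not taken from the video.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration):
# two numeric features on very different scales.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2)) * np.array([1.0, 100.0])
y = (X[:, 0] + X[:, 1] / 100.0 > 0).astype(int)

# Each step is a (name, estimator) pair; every step but the last
# must be a transformer, and the last is the final model.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("classify", LogisticRegression()),
])

# fit() fits the scaler, transforms X, then fits the classifier.
pipe.fit(X, y)

# score() applies the same fitted scaling before predicting.
train_accuracy = pipe.score(X, y)
print(f"training accuracy: {train_accuracy:.3f}")
```

Because the scaler is fitted inside the pipeline, using the pipeline with cross-validation refits the scaling on each training fold, which helps avoid leaking information from held-out data.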
This video introduces regularization: ridge regression, lasso regression, and the elasticnet. Lasso regression can help with (semi-)automatic feature selection.
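To see why lasso can act as a feature selector, the following sketch fits ridge and lasso regression to synthetic data in which only two of five features matter. The data, the `alpha` values, and the variable names are assumptions chosen for illustration; the elastic net (`sklearn.linear_model.ElasticNet`) blends the two penalties and is not shown here.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data (an assumption): five features,
# only the first two influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=300)

# Ridge uses an L2 penalty: coefficients shrink toward zero
# but generally do not become exactly zero.
ridge = Ridge(alpha=1.0).fit(X, y)

# Lasso uses an L1 penalty: sufficiently weak coefficients
# are set to exactly zero, dropping those features.
lasso = Lasso(alpha=0.1).fit(X, y)

n_zero_ridge = int(np.sum(ridge.coef_ == 0.0))
n_zero_lasso = int(np.sum(lasso.coef_ == 0.0))

print("ridge coefficients:", np.round(ridge.coef_, 3))
print("lasso coefficients:", np.round(lasso.coef_, 3))
print("exact zeros (ridge, lasso):", n_zero_ridge, n_zero_lasso)
```

The strength `alpha` controls how aggressively features are dropped, so in practice it is usually tuned on validation data rather than fixed in advance.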
Pipeline and Regularization
This notebook demonstrates pipelines and \(L_2\)-regularized regression, and performs a significance test of classifier improvement.
It also shows how to train a decision tree (covered in the next video).
Models and Depth
What does the world look like beyond logistic regression? Can a model output be a feature?
Inference and Ablation
How do we understand, robustly, the performance of our system? What contributes to its performance?
Statistical Significance Tests
For further reading, you can also see Approximate Statistical Tests.
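One common way to test whether one classifier improves on another over the same test set is McNemar's test, which looks only at the items on which the two classifiers disagree. Whether the video uses this particular test is an assumption; the function name `mcnemar_exact` and its arguments are hypothetical, and the sketch uses the exact binomial form of the test with only the standard library.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: test items classifier A got right and classifier B got wrong
    c: test items classifier A got wrong and classifier B got right
    Items where both agree carry no information and are ignored.
    """
    n = b + c
    if n == 0:
        # No disagreements: no evidence of any difference.
        return 1.0
    k = min(b, c)
    # Under the null hypothesis, each disagreement is equally likely
    # to favor either classifier, so min(b, c) ~ Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail for a two-sided test, capped at 1.
    return min(1.0, 2.0 * tail)


# Hypothetical disagreement counts, for illustration only.
p_value = mcnemar_exact(15, 5)
print(f"p-value: {p_value:.4f}")
```

A small p-value suggests the disagreements favor one classifier more than chance would explain; it says nothing about how large the improvement is.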
This video discusses how to work with dates in Pandas.
- Date operations notebook
- Pandas time series / date functionality
- Pandas time deltas
- Format codes
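The resources above cover parsing, date arithmetic, and formatting; a minimal sketch of those three operations follows. The data frame, its column names, and the sample dates are made up for illustration and do not come from the linked notebook.

```python
import pandas as pd

# Hypothetical data: dates stored as strings.
df = pd.DataFrame({"date_str": ["2020-11-01", "2020-11-11", "2020-12-25"]})

# Parse strings into datetime64 values using an explicit format code.
df["date"] = pd.to_datetime(df["date_str"], format="%Y-%m-%d")

# The .dt accessor exposes date components and helpers.
df["weekday"] = df["date"].dt.day_name()

# Subtracting dates yields time deltas; .dt.days extracts whole days.
due = pd.Timestamp("2020-11-11")
df["days_until_due"] = (due - df["date"]).dt.days

# strftime formats dates back into strings with format codes.
df["label"] = df["date"].dt.strftime("%b %d, %Y")

print(df[["date", "weekday", "days_until_due", "label"]])
```

Passing an explicit `format` to `pd.to_datetime` avoids ambiguous parses (for example, day-first versus month-first) and is typically faster on large columns.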
Assignment 5 is due November 11, 2020.