Week 11 — Evaluation

Activities:

There is no quiz this week.

This week's videos are also available as a Panopto playlist.

Intro & Context

In this video, I review where we are conceptually and recap the ideas of estimating conditional probability and expectation.

Feature Transforms

What are some useful techniques for engineering features in an application?

Workflow

How do you do feature engineering and model selection in a machine learning workflow? What is the iterative process involved?

SciKit Pipelines

In this video, I introduce SciKit pipelines that put multiple transformations together.
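As a rough illustration of the idea, the sketch below chains a scaling step and a logistic regression into one scikit-learn `Pipeline`. The synthetic data, the step names (`"scale"`, `"model"`), and the choice of `StandardScaler` are assumptions made for this example, not necessarily what the video uses.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real data set (assumption for illustration).
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Each step is a (name, estimator) pair; all but the last must be transformers.
pipe = Pipeline([
    ("scale", StandardScaler()),       # standardize each feature
    ("model", LogisticRegression()),   # final estimator
])

# fit() learns the scaler's parameters from the training data only,
# then fits the model on the scaled training data.
pipe.fit(X_train, y_train)

# score() applies the already-fitted scaler to the test data before predicting.
accuracy = pipe.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

Because the transformation is fitted inside the pipeline, the same object can be passed to cross-validation utilities without leaking test-fold statistics into the scaler.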

SciKit Learn Pipelines

Read the SciKit-Learn User Guide chapter on pipelines.

SciKit Learn Preprocessing

Read the SciKit-Learn User Guide chapter on pre-processing.

Regularization

This video introduces regularization: ridge regression, lasso regression, and the elasticnet. Lasso regression can help with (semi-)automatic feature selection.
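To make the feature-selection remark concrete, here is a minimal sketch that fits the three regularized models on synthetic regression data and counts how many coefficients each sets exactly to zero. The data, the `alpha` values, and `l1_ratio` are arbitrary assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge

# Synthetic data: 20 features, only 5 of which actually influence y (assumption).
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=5.0, random_state=0
)

# alpha controls the penalty strength; the values here are arbitrary.
ridge = Ridge(alpha=1.0).fit(X, y)                    # L2 penalty
lasso = Lasso(alpha=1.0).fit(X, y)                    # L1 penalty
enet = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(X, y)  # mix of L1 and L2

# Count coefficients that are exactly zero in each model.
n_zero_ridge = int(np.sum(ridge.coef_ == 0))
n_zero_lasso = int(np.sum(lasso.coef_ == 0))
n_zero_enet = int(np.sum(enet.coef_ == 0))

print("zero coefficients:", n_zero_ridge, n_zero_lasso, n_zero_enet)
```

The L1 penalty can drive coefficients exactly to zero, which is why lasso can act as a feature selector, whereas ridge typically only shrinks them toward zero.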

Pipeline and Regularization

This notebook demonstrates pipelines and \(L_2\)-regularized regression, and performs a significance test of classifier improvement.

It also shows how to train a decision tree (covered in the next video).

Models and Depth

What does the world look like beyond logistic regression? Can a model output be a feature?

Inference and Ablation

How do we understand, robustly, the performance of our system? What contributes to its performance?

Statistical Significance Tests

Read Statistical Significance Tests for Comparing Machine Learning Algorithms.

For further reading, you can also see Approximate Statistical Tests.
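One commonly used test for comparing two classifiers evaluated on the same test set is McNemar's test, which looks only at the examples where the two classifiers disagree. The sketch below is an assumption-laden illustration (the helper name `mcnemar_test` and the toy labels are hypothetical), and it is not necessarily the test used in this week's notebook.

```python
import numpy as np
from scipy.stats import chi2


def mcnemar_test(y_true, pred_a, pred_b):
    """McNemar's test with continuity correction (hypothetical helper).

    Returns (statistic, p_value) for the null hypothesis that the two
    classifiers have the same error rate on this test set.
    """
    y_true = np.asarray(y_true)
    a_correct = np.asarray(pred_a) == y_true
    b_correct = np.asarray(pred_b) == y_true

    # Discordant pairs: exactly one of the two classifiers is correct.
    n_a_only = int(np.sum(a_correct & ~b_correct))
    n_b_only = int(np.sum(~a_correct & b_correct))

    if n_a_only + n_b_only == 0:
        # The classifiers never disagree, so there is no evidence of a difference.
        return 0.0, 1.0

    stat = (abs(n_a_only - n_b_only) - 1) ** 2 / (n_a_only + n_b_only)
    p_value = float(chi2.sf(stat, df=1))  # chi-squared with 1 degree of freedom
    return stat, p_value


# Toy labels and predictions, made up for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
pred_a = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
pred_b = [1, 1, 0, 1, 0, 1, 1, 0, 0, 0]
stat, p_value = mcnemar_test(y_true, pred_a, pred_b)
print(f"statistic={stat:.3f}, p={p_value:.3f}")
```

The chi-squared approximation is unreliable when there are few discordant pairs; an exact binomial version is usually preferred in that case.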

Dates

This video discusses how to work with dates in Pandas.
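A minimal sketch of common date operations follows: parsing strings into datetimes, then deriving features through the `.dt` accessor. The column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical data; dates arrive as strings, as they often do from CSV files.
df = pd.DataFrame({
    "order_date": ["2020-11-02", "2020-11-07", "2020-11-11"],
    "amount": [10.0, 25.5, 7.25],
})

# Convert the string column to a proper datetime64 column.
df["order_date"] = pd.to_datetime(df["order_date"])

# The .dt accessor exposes date components as new columns.
df["year"] = df["order_date"].dt.year
df["month"] = df["order_date"].dt.month
df["day_of_week"] = df["order_date"].dt.dayofweek  # Monday=0, Sunday=6
df["is_weekend"] = df["day_of_week"] >= 5

# Subtracting datetimes gives timedeltas; .dt.days converts them to whole days.
df["days_since_first"] = (df["order_date"] - df["order_date"].min()).dt.days

print(df)
```

Derived columns like these can then be used as model features in place of the raw timestamp.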

Assignment 5

Assignment 5 is due November 11, 2020.