Week 11 — More Modeling (11/1–5)¶
This week, we’re going to learn more about model building that will be useful in Assignment 5:
SciKit-Learn pipelines and workflows
Analyzing model results
🧐 Content Overview¶
🎥 Intro & Context¶
In this video, I review where we are at conceptually, and recap the ideas of estimating conditional probability and expectation.
🎥 Feature Transforms¶
What are some useful techniques for engineering features in an application?
How do you do feature engineering and model selection in a machine learning workflow? What is the iterative process involved?
🎥 SciKit Pipelines¶
In this video, I introduce SciKit pipelines that put multiple transformations together.
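As a rough illustration of what "putting multiple transformations together" can look like, here is a minimal sketch of a scikit-learn pipeline that standardizes features and then fits a logistic regression. The synthetic data, the step names (`scale`, `classify`), and the split sizes are assumptions made for this example and are not taken from the video or the notebooks.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real data set (assumption for illustration).
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Each step is a (name, estimator) pair; every step but the last is a transformer.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("classify", LogisticRegression(max_iter=1000)),
])

# fit() learns the scaling parameters and the model from the training data only.
pipe.fit(X_train, y_train)

# score() applies the same fitted scaling to the test data before predicting.
accuracy = pipe.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

Because the scaler lives inside the pipeline, it is refit on each training fold when the pipeline is passed to tools such as `cross_val_score`, which helps avoid leaking test data into the transformation.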
This video introduces regularization: ridge regression, lasso regression, and the elastic net. Lasso regression can help with (semi-)automatic feature selection.
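To make the feature-selection point concrete, the sketch below fits ridge, lasso, and elastic net models to synthetic data in which only the first three of ten features affect the outcome. The data, the `alpha` values, and the `l1_ratio` are assumptions chosen for illustration; in practice the penalty strength would be tuned, for example by cross-validation.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Ten features, but only the first three influence the outcome (assumption).
X = rng.normal(size=(200, 10))
true_coef = np.array([3.0, -2.0, 1.5, 0, 0, 0, 0, 0, 0, 0])
y = X @ true_coef + rng.normal(scale=0.5, size=200)

# Regularization penalizes coefficient size, so features should share a scale.
X_std = StandardScaler().fit_transform(X)

models = {
    "ridge": Ridge(alpha=1.0),                            # L2 penalty
    "lasso": Lasso(alpha=0.1),                            # L1 penalty
    "elastic net": ElasticNet(alpha=0.1, l1_ratio=0.5),   # mix of L1 and L2
}

n_zero = {}
for name, model in models.items():
    model.fit(X_std, y)
    # Count coefficients driven exactly to zero by the penalty.
    n_zero[name] = int(np.sum(model.coef_ == 0))
    print(f"{name:12s} zero coefficients: {n_zero[name]}")
```

The L1 penalty can set coefficients exactly to zero, which is what makes lasso usable as a semi-automatic feature selector; the L2 penalty in ridge shrinks coefficients toward zero but generally leaves them nonzero.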
📓 Pipeline and Regularization¶
This notebook demonstrates pipelines and \(L_2\) regression, and performs a significance test of classifier improvement.
It also shows how to train a decision tree (covered in the next video).
📓 Advanced Pipelines¶
The Advanced Pipelines notebook demonstrates a much more advanced SciKit-Learn pipeline.
🎥 Models and Depth¶
What does the world look like beyond logistic regression? Can a model output be a feature?
🎥 Inference and Ablation¶
How do we understand, robustly, the performance of our system? What contributes to its performance?
📃 Statistical Significance Tests¶
In the Week 9 activity, we used the paired t-test for comparing the output of two regression models. Our use of this test did not violate the guidance in this reading — why is that?
For further reading, you can also see Approximate Statistical Tests.
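As a reminder of the mechanics, here is a minimal sketch of a paired t-test on the per-item squared errors of two regression models evaluated on the same test items. The simulated predictions and the names `pred_a` and `pred_b` are hypothetical stand-ins, not the models from the Week 9 activity.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated ground truth and two sets of predictions (assumption for illustration).
y_true = rng.normal(size=100)
pred_a = y_true + rng.normal(scale=1.0, size=100)
pred_b = y_true + rng.normal(scale=0.8, size=100)

# Per-item squared errors; item i in err_a is paired with item i in err_b.
err_a = (y_true - pred_a) ** 2
err_b = (y_true - pred_b) ** 2

# The paired t-test asks whether the mean of the per-item differences is zero.
result = stats.ttest_rel(err_a, err_b)
t_stat, p_value = result.statistic, result.pvalue
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

The test is equivalent to a one-sample t-test on the differences `err_a - err_b`, which is why the pairing of items matters.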
This video discusses how to work with dates in Pandas.
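A few common date operations can be sketched as follows. The data frame and its column names (`when`, `value`) are made up for this example and may differ from what the video covers.

```python
import pandas as pd

# Dates often arrive as strings (made-up data for illustration).
df = pd.DataFrame({
    "when": ["2021-11-01", "2021-11-03", "2021-11-05"],
    "value": [10, 12, 9],
})

# Parse the strings into a datetime column.
df["when"] = pd.to_datetime(df["when"])

# The .dt accessor exposes date components and helpers.
df["weekday"] = df["when"].dt.day_name()

# Subtracting datetimes yields timedeltas; .dt.days converts them to whole days.
df["days_since_start"] = (df["when"] - df["when"].min()).dt.days

# With a datetime index, resample() aggregates by calendar period (here, weekly).
weekly = df.set_index("when")["value"].resample("W").sum()

print(df)
print(weekly)
```

Derived columns such as the weekday or the days elapsed since a reference date are often useful as model features.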
🚩 Quiz 11¶
Quiz 11 is in Canvas.