Tutorials

This is a collection of notebooks with tips and consolidated references for the various Python and Pandas topics that we are discussing.

Python and Data

These notebooks are on basic Python data manipulation:

Fun with Numbers
Writing and Using Functions
Selecting Data
Reshaping Data
Building Data — building up arrays and data series
Indexing
Missing Data
Tricks with Boolean Series

Visualization

Drawing Charts
Movie Score Charting Examples — example charts used in several videos in 📅 Week 3 — Presentation (9/6–10)
Chart Finishing Touches

Probability and Statistics

Penguin Inference (from 📅 Week 4 — Inference (9/13–17))
Empirical Probabilities (demonstration of using boolean series to compute probabilities with empirical data)
Probability Distributions (from 📅 Week 4 — Inference (9/13–17))
Sampling Distributions (from 📅 Week 4 — Inference (9/13–17))
One Sample T-test and Distribution Comparison
Correlation
Regressions (goes with Week 8)
Logistic Regression
Sampling and Testing the Penguins
Linear Models with scipy minimize
Overfitting Simulation Example

SciKit-Learn and ML Models

SciKit-Learn Logistic Regression
SciKit-Learn Pipelines and Regularization — also includes a significance test for difference in classifier accuracy, and a decision tree
Linear Regression with SciKit-Learn — also uses a pipeline and applies standardization
Advanced SciKit-Learn Pipeline
Dummy-Coding and Feature Combination with SciKit-Learn Pipelines
Another advanced SciKit-Learn pipeline and logistic regression example (on Towards Data Science)
Movie Decomposition (from 🎥 Decomposing Matrices)
PCA demo (from 🎥 Decomposing Matrices)
K-Means Example (uses the chi-papers data from Week 13)
Tuning Hyperparameters

Specific Data Set Examples

These are more advanced examples of data manipulation and collection:

MovieLens Time Series
Sessionization (demonstrates some more advanced aggregation and time-based operations)
Spam Filter (demonstrates building a spam filter)
Using the Census (describes how to access census data)
Fetching CHI Papers (creates the chi-papers.csv file from Internet sources)

Workflow Example

This example demonstrates a complete Git-based workflow:

Git repo & workflow example
