# Tutorials

This is a collection of notebooks with tips and consolidated references for the various Python and Pandas topics that we are discussing.

Note

This semester, I am teaching with plotnine instead of Seaborn, but

## Python and Data

These notebooks are on basic Python data manipulation:

## Visualization

1. Drawing Charts

2. Movie Score Charting Examples — example charts used in several videos in 📅 Week 3 — Presentation (9/5–9)

3. Charts from the Ground Up — notebook for 🎥 Charts from the Ground Up.

4. Chart Finishing Touches

## Probability and Statistics

1. Empirical Probabilities (demonstration of using boolean series to compute probabilities with empirical data)

2. Magic Numbers (demonstration of where various “magic” numbers come from)

3. One Sample T-test and Distribution Comparison

4. Confidence

5. Correlation

6. Regressions (goes with Week 8)

7. Random Numbers

8. Logistic Regression

9. Sampling and Testing the Penguins

10. Linear Models with scipy minimize

11. Overfitting Simulation example

12. Random Sampling
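As a quick taste of the idea behind the Empirical Probabilities notebook, here is a minimal sketch of computing probabilities from boolean series. It is not taken from the notebook itself: the `movies` data frame, its column names, and the 4.0 rating threshold are all made up for illustration. The only trick is that the mean of a boolean series is the fraction of `True` values.

```python
import pandas as pd

# Hypothetical toy data, invented for this illustration only
movies = pd.DataFrame({
    "genre": ["comedy", "drama", "comedy", "action",
              "drama", "comedy", "action", "drama"],
    "rating": [3.5, 4.5, 2.0, 4.0, 3.0, 4.5, 2.5, 5.0],
})

# Each comparison yields a boolean series (one True/False per row)
is_comedy = movies["genre"] == "comedy"
is_high = movies["rating"] >= 4.0  # assumed cutoff for a "high" rating

# The mean of a boolean series is the fraction of True values,
# which is the empirical probability of the event
p_comedy = is_comedy.mean()
p_high = is_high.mean()

# Joint probability: combine the series with & before averaging
p_both = (is_comedy & is_high).mean()

# Conditional probability P(high | comedy): restrict to comedies, then average
p_high_given_comedy = is_high[is_comedy].mean()

print(f"P(comedy) = {p_comedy:.3f}")
print(f"P(high) = {p_high:.3f}")
print(f"P(comedy and high) = {p_both:.3f}")
print(f"P(high | comedy) = {p_high_given_comedy:.3f}")
```

Filtering with `is_high[is_comedy]` gives the same answer as dividing `p_both` by `p_comedy`, which is a handy sanity check when working with real data.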

## SciKit-Learn and ML Models

1. SciKit-Learn Logistic Regression

2. SciKit-Learn Pipelines and Regularization — also includes a significance test for difference in classifier accuracy, and a decision tree

3. Linear Regression with SciKit-Learn — also uses a pipeline and applies standardization

5. Dummy-Coding and Feature Combination with SciKit-Learn Pipelines

6. Another advanced SciKit-Learn pipeline and logistic regression example (on Towards Data Science)

7. K-Means Example (uses the chi-papers data from Week 13)

8. Tuning Hyperparameters

## Specific Data Set Examples

These are more advanced examples of data manipulation and collection:

1. MovieLens Time Series

2. Sessionization (demonstrates some more advanced aggregation and time-based operations)

3. Spam Filter (demonstrates building a spam filter)

4. Using the Census (describes how to access census data)

5. Fetching CHI Papers (creates the chi-papers.csv file from Internet sources)

## Workflow Example

This example demonstrates a complete Git-based workflow: