(tutorials)=
# Tutorials
This is a collection of notebooks with tips and consolidated references for the various Python and Pandas topics that we are discussing.
:::{note}
This semester, I am teaching with plotnine instead of Seaborn.
:::
(python-and-data)=
## Python and Data
These notebooks are on basic Python data manipulation:
1. [Fun with Numbers](FunWithNumbers.ipynb)
2. [Types and Operations](TypesAndOperations.ipynb)
3. [Writing and Using Functions](Functions.ipynb)
4. [Selecting Data](Selection.ipynb)
5. [Reshaping Data](Reshaping.ipynb)
6. [Building Data](BuildingData.ipynb) — building up arrays and data series
7. [Indexing](Indexing.ipynb)
8. [Missing Data](MissingData.ipynb)
9. [Tricks with Boolean Series](BooleanSeries.ipynb)
## Visualization
1. [Drawing Charts](Charting.ipynb)
2. [Movie Score Charting Examples](CriticScores.ipynb) — example charts used in several videos in {module}`week3`
3. [Charts from the Ground Up](ChartsFromTheGroundUp.ipynb) — notebook for {video}`week3:charts-ground-up`
4. [Chart Finishing Touches](ChartFinishingTouches.ipynb)
## Probability and Statistics
1. [Penguin Inference](PenguinSamples.ipynb) (from {module}`week4`)
2. [Empirical Probabilities](EmpiricalProbabilities.ipynb) (demonstration of using boolean series to compute probabilities with empirical data)
3. [Probability Distributions](Distributions.ipynb) (from {module}`week4`)
4. [Sampling Distributions](SamplingDists.ipynb) (from {module}`week4`)
5. [Magic Numbers](MagicNumbers.ipynb) (demonstration of where various “magic” numbers come from)
6. [One Sample T-test and Distribution Comparison](OneSample.ipynb)
7. [Confidence](Confidence.ipynb)
8. [Correlation](Correlation.ipynb)
9. [Regressions](Regressions.ipynb) (goes with [Week 8](../../week8/index.md))
10. [Random Numbers](RandomNumbers.ipynb)
11. [Logistic Regression](LogitRegressionDemo.ipynb)
12. [Sampling and Testing the Penguins](PenguinSamples.ipynb)
13. [Linear Models with scipy `minimize`](MinimizeRegression.ipynb)
14. [Overfitting Simulation Example](OverfittingSimulation.ipynb)
15. [Random Sampling](RandomSearchWorks.ipynb)
## SciKit-Learn and ML Models
1. [SciKit-Learn Logistic Regression](SciKitLogistic.ipynb)
2. [SciKit-Learn Pipelines and Regularization](SciKitPipeline.ipynb) — also includes a significance test for difference in classifier accuracy, and a decision tree
3. [Linear Regression with SciKit-Learn](SciKitRegression.ipynb) — also uses a pipeline and applies standardization
4. [Advanced SciKit-Learn Pipeline](AdvancedPipeline.ipynb)
5. [Dummy-Coding and Feature Combination with SciKit-Learn Pipelines](SciKitTransform.ipynb)
6. [Another advanced SciKit-Learn pipeline and logistic regression example](https://towardsdatascience.com/logistic-regression-classifier-on-census-income-data-e1dbef0b5738) (on Towards Data Science)
7. [Movie Decomposition](MovieDecomp.ipynb) from {video}`week13:decomp`
8. [PCA demo](PCADemo.ipynb) from {video}`week13:decomp`
9. [K-Means Example](ClusteringExample.ipynb) (uses the chi-papers data from [Week 13](../../week13/index.md#practice))
10. [Tuning Hyperparameters](TuningExample.ipynb)
## Specific Data Set Examples
These are more advanced examples of data manipulation and collection:
1. [MovieLens Time Series](MLTimeSeries.ipynb)
2. [Sessionization](Sessions.ipynb) (demonstrates some more advanced aggregation and time-based operations)
3. [Spam Filter](SpamFilter.ipynb) demonstrates building a spam filter.
4. [Using the Census](UsingTheCensus.ipynb) describes how to access census data.
5. [Fetching CHI Papers](FetchCHIPapers.ipynb) creates the `chi-papers.csv` file from Internet sources.
## Workflow Example
This example demonstrates a complete Git-based workflow:
1. [Git repo & workflow example](https://github.com/BoiseState/cs533-hcibib-demo)