This is a collection of notebooks with tips and consolidated references for the Python and pandas topics that we are discussing.

Python and Data

These notebooks cover basic Python data manipulation:

  1. Fun with Numbers

  2. Writing and Using Functions

  3. Selecting Data

  4. Reshaping Data

  5. Building Data — building up arrays and data series

  6. Indexing

  7. Missing Data

  8. Tricks with Boolean Series

Specific Data Set Examples

These are more advanced examples of data manipulation and collection:

  1. MovieLens Time Series

  2. Sessionization (demonstrates some more advanced aggregation and time-based operations)

  3. Spam Filter (demonstrates building a spam filter)

  4. Using the Census (describes how to access census data)

  5. Fetching CHI Papers (creates the chi-papers.csv file from Internet sources)

Workflow Example

This example demonstrates a complete Git-based workflow:

  1. Git repo & workflow example