Week 6 — Two Variables (Sep. 27–Oct. 1)

Attention

The first midterm exam is on Tuesday.

This week’s learning outcomes are:

  1. Display two potentially related numeric variables for exploratory analysis.

  2. Compute correlation coefficients between variables.

  3. Run a linear regression.

Since we have the exam this week, the lecture load is significantly reduced.

🧐 Content Overview

Element                   Length

🎥 Two Variables Intro    4m36s

🎥 Displaying Variables   3m45s

🎥 Correlation            11m12s

🎥 Regression             6m3s

🎥 Features               4m20s

This week has about 30 minutes of video and no assigned readings. This week’s videos are available in a Panopto folder and as a podcast.

📅 Deadlines

  • Midterm A Tuesday 9:00–10:15 AM (in class)

  • Quiz 6 Thursday by 8:00 AM

🚩 Midterm A

The first midterm is on Tuesday. It is written to take about an hour, and covers material up through and including Week 5.

Format

This is a written, in-person exam. It will contain a variety of questions to assess your ability to understand and apply concepts from the class. Question formats include:

  • Multiple-choice

  • True/false

  • Matching

  • Fill-in-the-blank

  • Short answer

I may ask you to do a range of things on the exam, including (but not limited to):

  • Define a concept

  • Compute a metric from a small quantity of data

  • Interpret a chart

Exam Rules

  • You may have 1 note sheet, letter- or A4-sized, single-sided. (For the final, you will be allowed a two-sided note sheet.)

  • You should not need a calculator, but may bring one if you wish.

  • You may answer in either pen or pencil.

Study Tips

  • Review the previous quizzes and assignments.

  • Review the lecture slides to find concepts that are still unclear and need more study.

  • Skim the assigned readings, particularly the section headings, to remind yourself what was in them.

  • Review the course glossary, keeping in mind that it does contain terms we haven’t gotten to yet.

🎥 Introduction

This video introduces the week’s topic.

🎥 Displaying Variables

This video discusses how to display related numeric variables.
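As a rough illustration of displaying two numeric variables, the sketch below draws a scatter plot with pandas. The data frame, its column names (hours, score), and the axis labels are hypothetical values invented for this example; they do not come from the video or the course notebook.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [52, 55, 61, 70, 72],
})

# One point per row: hours on the x-axis, score on the y-axis.
# DataFrame.plot.scatter uses matplotlib and returns the Axes object.
ax = df.plot.scatter(x="hours", y="score")
ax.set_xlabel("Hours studied")
ax.set_ylabel("Exam score")
ax.set_title("Score vs. hours studied")
```

In a notebook the chart renders automatically; in a plain script, you would typically call matplotlib's plt.show() or ax.figure.savefig(...) afterwards.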

🎥 Correlation

This video discusses how to compute the correlation coefficient between two variables.

Warning

In this video, I list the Pandas correlation function as cor. The correct name is corr.
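A minimal sketch of the corrected name in use, assuming a small hypothetical data frame (the columns hours and score are invented for this example). Both Series.corr and DataFrame.corr compute the Pearson correlation by default.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [52, 55, 61, 70, 72],
})

# Correlation between two specific columns (note: corr, not cor).
r = df["hours"].corr(df["score"])

# Pairwise correlation matrix for all numeric columns.
corr_matrix = df.corr()

print(f"r = {r:.3f}")
print(corr_matrix)
```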

🎥 Regression

This video discusses how to fit a line between two variables.
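One way to fit such a line is ordinary least squares. The sketch below uses numpy.polyfit with degree 1 on hypothetical toy data; the video and notebook may use a different library, so treat this as an illustration of the idea rather than the course's method.

```python
import numpy as np

# Hypothetical toy data, invented for illustration only.
hours = np.array([1, 2, 3, 4, 5])
score = np.array([52, 55, 61, 70, 72])

# A degree-1 polynomial fit is a least-squares line: score ≈ slope * hours + intercept.
# polyfit returns coefficients from highest degree to lowest.
slope, intercept = np.polyfit(hours, score, deg=1)

# Predicted values on the fitted line, and the residuals (observed - predicted).
predicted = slope * hours + intercept
residuals = score - predicted

print(f"score = {slope:.2f} * hours + {intercept:.2f}")
```

With an intercept in the model, the least-squares residuals sum to approximately zero, which is a quick sanity check on the fit.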

📓 Correlation Notebook

The correlation notebook shows how to compute the metrics in this week’s videos, and has the code I used to produce the charts in the slides.

🎥 Features

This video introduces the idea of feature engineering.
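Feature engineering generally means deriving new variables from the ones you already have. The sketch below shows two common kinds of derived feature, a log transform and a ratio, on a hypothetical data frame; the column names and values are invented here and are not taken from the video.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "budget": [1_000_000, 10_000_000, 100_000_000],
    "revenue": [3_000_000, 15_000_000, 400_000_000],
})

# Log transform: compresses values that span several orders of magnitude.
df["log_budget"] = np.log10(df["budget"])

# Ratio feature: combines two raw columns into one comparable quantity.
df["return_ratio"] = df["revenue"] / df["budget"]

print(df)
```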

🚩 Week 6 Quiz

The Week 6 quiz is due before class on Thursday as usual.

📩 Assignment 3

Assignment 3 is due October 10.