# Week 5 – Filling In (9/19–23)

This week introduces one new statistical concept, the hypothesis test, and is otherwise about practice and solidifying concepts. I'm also going to take a step back and give more context for some of the things we're talking about.

Our learning outcomes are:

• Compute and interpret a hypothesis test

• Avoid p-hacking and HARKing

• Understand how to read and interpret Python errors

• Understand how the quantitative techniques we are learning in this class fit in a broader landscape of epistemologies

## Content Overview

| Element | Length |
| --- | --- |
| Comparing Distributions | 5m6s |
| Testing Hypotheses | 14m51s |
| T-tests | 12m24s |
| Cookbook 1 | 2000 words |
| Epistemology | 25m44s |
| One Sample T-test and Q-Q Plot | 653 words |
| Python Errors | 7m28s |
| Python Libraries | 3m43s |
| Learning More | 5m10s |

This week has 1h14m of video and 2653 words of assigned readings. This week's videos are available in a Panopto folder.

• Week 5 Quiz is due on Thursday at 8AM.

• Assignment 2 is due on Sunday, September 25, 2022 at 11:59 PM.

• Midterm A is next week, on September 28.

## Assignment 1 Solution

The Assignment 1 solution is on Piazza.

## Course Glossary

If you haven't yet, I highly recommend consulting the course glossary. Please post on Piazza if you have suggested additions!

The glossary is also likely to be useful in studying for the exam next week.

## Writing Functions

I've used Python functions in a few of my example notebooks. The function notebook talks more about them, how to write them, and how to use them.
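
As a minimal sketch of what such a function can look like (the name `summarize` and the sample values are hypothetical, not taken from the function notebook), a function bundles a computation you want to reuse and returns its result:

```python
from statistics import mean, stdev


def summarize(values):
    """Return the count, mean, and sample standard deviation of some numbers."""
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),  # sample standard deviation (n - 1 denominator)
    }


# Call the function on a small made-up sample.
result = summarize([2.0, 4.0, 4.0, 6.0])
print(result)
```

Once defined, the same function can be called on any other list of numbers without repeating the computation.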

## Comparing Distributions

This video describes how to use Q-Q plots to compare data against a distribution.
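
A rough sketch of the idea, assuming SciPy is available and using synthetic data rather than anything from the video: `scipy.stats.probplot` pairs each sorted observation with the corresponding theoretical quantile, and a nearly straight line suggests the data are close to that distribution.

```python
import numpy as np
from scipy import stats

# Synthetic data for illustration only (seed and parameters are arbitrary).
rng = np.random.default_rng(20220919)
sample = rng.normal(loc=10, scale=2, size=200)

# probplot returns the theoretical quantiles, the sorted observations,
# and a least-squares line fit through the resulting points.
(theoretical_q, ordered_values), (slope, intercept, r) = stats.probplot(
    sample, dist="norm"
)

print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.3f}")

# To draw the plot itself, pass a matplotlib module or axes as `plot`:
#   import matplotlib.pyplot as plt
#   stats.probplot(sample, dist="norm", plot=plt)
#   plt.show()
```

For normally distributed data, the slope approximates the standard deviation and the intercept approximates the mean.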

## Cartoon

The cartoon illustrates p-hacking: running tests until we find one that is significant.
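
To see why this is a problem, here is a small simulation sketch (my own illustration with made-up settings, assuming NumPy and SciPy): every test compares two groups drawn from the same distribution, so any "significant" result is a false positive. Running 20 such tests per experiment makes at least one false positive quite likely.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_experiments = 500  # simulated "studies"
n_tests = 20         # tests run within each study
alpha = 0.05

false_alarm = 0
for _ in range(n_experiments):
    # Both groups come from the same distribution, so the null
    # hypothesis is true for every one of the 20 tests.
    a = rng.normal(size=(n_tests, 30))
    b = rng.normal(size=(n_tests, 30))
    p_values = stats.ttest_ind(a, b, axis=1).pvalue
    if (p_values < alpha).any():
        false_alarm += 1

rate = false_alarm / n_experiments
expected = 1 - (1 - alpha) ** n_tests  # chance of at least one false positive

print(f"simulated: {rate:.2f}, theoretical: {expected:.2f}")
```

The theoretical value assumes the 20 tests are independent, which holds in this simulation but often does not in real analyses.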

## T-tests

This video discusses the t-test in more detail, and the different kinds of t-tests that we can run. It also introduces degrees of freedom.
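
As a hedged sketch of how the common variants map onto SciPy calls (the data are synthetic and the variable names are hypothetical; the video may organize things differently):

```python
import numpy as np
from scipy import stats

# Synthetic data for illustration only.
rng = np.random.default_rng(5)
before = rng.normal(loc=50, scale=5, size=25)
after = before + rng.normal(loc=2, scale=3, size=25)  # same subjects, measured again
other = rng.normal(loc=53, scale=8, size=40)          # an unrelated group

# One-sample: is the mean of `before` different from 50?
one_sample = stats.ttest_1samp(before, popmean=50)

# Paired: compares matched observations; equivalent to a one-sample
# test on the differences.
paired = stats.ttest_rel(after, before)

# Two independent samples, assuming equal variances (pooled).
pooled = stats.ttest_ind(before, other)

# Welch's t-test: independent samples without the equal-variance assumption.
welch = stats.ttest_ind(before, other, equal_var=False)

# Degrees of freedom for the simple cases.
df_one_sample = len(before) - 1            # also applies to the paired test
df_pooled = len(before) + len(other) - 2

for name, res in [("one-sample", one_sample), ("paired", paired),
                  ("pooled", pooled), ("Welch", welch)]:
    print(f"{name}: t={res.statistic:.3f}, p={res.pvalue:.4f}")
```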

## Tying It Together

I will be adding a notebook reading here to tie together some Week 4 and 5 material.

## Epistemology

In this video, I talk about how the quantitative data science methods we are learning fit into a broader picture of sources of knowledge.

The Week 5 quiz is about material through this point. The subsequent videos are to help you better understand and contextualize material.

## One Sample Notebook

The One Sample notebook demonstrates how to compute a one-sample t-test and draw a Q-Q plot to compare a distribution with the normal distribution.
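
For reference, the one-sample t statistic is $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$ with $n - 1$ degrees of freedom. The following sketch (made-up numbers, not the notebook's data; assumes SciPy) computes it by hand and checks the result against `scipy.stats.ttest_1samp`:

```python
import math
from statistics import mean, stdev

from scipy import stats

sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.7]  # made-up measurements
mu0 = 5.0  # hypothesized population mean

n = len(sample)
# t = (sample mean - hypothesized mean) / standard error of the mean
t_stat = (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))
df = n - 1
# Two-sided p-value from the t distribution with n - 1 degrees of freedom.
p_value = 2 * stats.t.sf(abs(t_stat), df)

# SciPy's built-in test should agree with the manual computation.
check = stats.ttest_1samp(sample, popmean=mu0)
print(f"manual: t={t_stat:.4f}, p={p_value:.4f}")
print(f"scipy:  t={check.statistic:.4f}, p={check.pvalue:.4f}")
```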

## Python Errors

This video discusses common Python errors and how to read error messages.
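
As a small illustration (my own example with a hypothetical `ratings` dictionary, not one from the video), a traceback is usually easiest to read from the bottom up: the last line names the exception type and message, and the lines above it show where the error occurred.

```python
import traceback

ratings = {"Toy Story": 4.5}  # hypothetical data

try:
    ratings["Heat"]  # this key does not exist, so Python raises KeyError
except KeyError as err:
    # Capture the same text Python would print if the error were not caught.
    report = traceback.format_exc()
    error_type = type(err).__name__

print(report)

# The final line of the traceback gives the exception type and its message.
last_line = report.strip().splitlines()[-1]
print("Start reading here:", last_line)
```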

## Learning More

In this video I talk about how I go about expanding my own data science knowledge and techniques, with the goal of giving you ideas for how you can continue learning beyond this class.

## Practice

There are a few things you can do to keep practicing the material:

• The HETREC data contains two data sets besides the movie data: Delicious bookmarks and Last.FM listening records. Download this data set and apply some of our exploratory techniques to it.

• Download the SBA data from Week 4's activity and describe the distributions of more of the variables.

• Apply the inference techniques from Week 4 to statistically test the differences you observed in Assignment 1.

## More Examples

Some more examples from my own work (these are not all cleaned up to our checklist standards):

## Tutorials

The tutorial notebooks include many useful things, and have a couple of additions moved over from Week 4 – Inference (9/12–16).