Week 13 — Unsupervised

This week, we are going to talk more about unsupervised learning, that is, learning without labels. We will not have time to investigate these techniques very deeply, but I want you to know about them, and you are experimenting with them in Assignment 6.

This week's content is lighter, since we just had a large assignment and a midterm, and another assignment is due on Sunday.

There is no quiz this week because of the cluster of deliverables and the upcoming assignment. I will double-check that this is not a grading problem for anyone.

This week's videos are also available as a Panopto playlist.

No Supervision

In this video, we review the idea of supervised learning and contrast it with unsupervised learning.

Decomposing Matrices

This video introduces the idea of matrix decomposition, which we can use to reduce the dimensionality of data points.
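As a rough illustration of the idea (not taken from the video or the notebook), here is a minimal sketch that uses the singular value decomposition from NumPy to reduce 4-dimensional points to 2 dimensions. The data is random and made up for the example, and the choice of k = 2 is arbitrary.

```python
import numpy as np

# Toy data: 6 points in 4 dimensions (random values, for illustration only).
rng = np.random.default_rng(42)
X = rng.normal(size=(6, 4))

# Thin SVD: X = U @ diag(s) @ Vt, with singular values in s sorted largest first.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep the k largest singular values to get a k-dimensional representation.
k = 2
X_reduced = U[:, :k] * s[:k]       # shape (6, 2): each row is a point in the reduced space
X_approx = X_reduced @ Vt[:k, :]   # rank-k approximation of X, shape (6, 4)

print(X_reduced.shape)
print("reconstruction error:", np.linalg.norm(X - X_approx))
```

The reconstruction error shows how much information is lost by dropping the smaller singular values; a larger k lowers it at the cost of more dimensions.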

Resources

Movie Decomposition

The Movie Decomposition notebook demonstrates matrix decomposition with movie data.

Clustering

This video introduces the concept of clustering, another useful unsupervised learning technique.

Resources

Clustering Example

The clustering example notebook shows how to use the KMeans class.
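For a quick sense of what that looks like, here is a minimal sketch assuming the KMeans class is the one from scikit-learn (sklearn.cluster.KMeans). The two synthetic blobs and the parameter values are made up for the example and are not from the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic data: two well-separated blobs of 50 points each (illustrative only).
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])

# Fit k-means with 2 clusters; no labels are passed, only the feature matrix.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit_predict(X)

print(km.cluster_centers_)   # one row per cluster center
print(np.bincount(labels))   # number of points assigned to each cluster
```

Note that the cluster numbers themselves are arbitrary: which blob ends up as cluster 0 or 1 carries no meaning.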

Vector Spaces

This video talks about vector spaces and transforms.

Practice: SVD on Paper Abstracts

The Week 13 Exercise notebook demonstrates latent semantic analysis on paper abstracts and has an exercise to classify text into new or old papers.

It requires the chi-papers.csv file, which is derived from the HCI Bibliography. It contains the abstracts of papers published at the CHI conference (the primary conference for human-computer interaction) over a period of nearly 40 years.

If you want to see how to create this file, see the Fetch CHI Papers example.
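If you want a feel for latent semantic analysis before opening the exercise, here is a minimal sketch on a tiny made-up corpus. It assumes scikit-learn's TfidfVectorizer and TruncatedSVD; the exercise notebook may set things up differently. The four sentences are invented stand-ins, not rows from chi-papers.csv.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

# Invented stand-in documents (not taken from chi-papers.csv).
docs = [
    "users interact with touch screens",
    "touch input on mobile screens",
    "eye tracking study of reading",
    "reading behavior measured with eye tracking",
]

# Step 1: turn each document into a sparse TF-IDF vector over the vocabulary.
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)        # sparse matrix, shape (4, n_terms)

# Step 2: truncated SVD projects the term vectors into a small latent space.
svd = TruncatedSVD(n_components=2, random_state=0)
Z = svd.fit_transform(X)             # dense array, shape (4, 2)

print(X.shape, "->", Z.shape)
```

The rows of Z are low-dimensional document vectors; they can be fed to a classifier, which is one way to approach the new-versus-old exercise.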

Assignment 6

Assignment 6 is due November 22, 2020.