Week 13 — Unsupervised Learning
This week, we are going to talk more about unsupervised learning — learning without labels.
We will not have time to investigate these techniques very deeply, but I want you to know about them, and you will experiment with them in Assignment 6.

This week's content is lighter, since we just had a large assignment and a midterm, and another assignment is due on Sunday.

There is no quiz this week due to the cluster of deliverables and upcoming assignment.
I will double-check that this is not a grading problem for anyone.

This week's videos are also available as a Panopto playlist.

No Supervision
In this video, we review the idea of supervised learning and contrast it with unsupervised learning.

CS 533
INTRO TO DATA SCIENCE
Michael Ekstrand
NO SUPERVISION
Learning Outcomes (Week)
Distinguish between supervised and unsupervised learning.
Project data into lower-dimensional space with matrix factorization.
Cluster data points.
Photo by Benedikt Geyer on Unsplash
Learning So Far
We learn to predict a label
Categorical label → classification
Continuous label → regression
This is called supervised learning
We have ground truth for outcome
Sometimes called supervision signal
Unsupervised Learning
What can we do without a supervision signal?
Group instances together (clustering)
Learn vector spaces for items
Learn relationships between items
Learn relationships between features
Middle Ground: Self-Supervised Learning
Sometimes we can extract supervision signals from data
Word embeddings: predict if two words appear together
Why?
Exploring data
Reducing data complexity
For visualization
For learning (“curse of dimensionality”)
Inputs into other models
Sometimes it’s all we have
Wrapping Up
Unsupervised learning learns patterns from data without labels.
It’s useful for grouping items together, exploration, and as input to other models.
Photo by Fran Jacquier on Unsplash
Decomposing Matrices
This video introduces the idea of matrix decomposition, which we can use to reduce the dimensionality of data points.

CS 533
INTRO TO DATA SCIENCE
Michael Ekstrand
DECOMPOSING MATRICES
Learning Outcomes
Review matrix multiplication
Decompose a matrix into a lower-rank approximation
Photo by Carissa Weiser on Unsplash
What Is a Matrix?
Matrix Multiplication
Sparse Matrix
A matrix is sparse (mathematically) if most values are 0.
Sparse matrix representations only store nonzero values
scipy.sparse
np.ndarray is our dense matrix
DataFrame and Series cannot be sparse 😔 (they store 0s)
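To make the storage difference concrete, here is a small sketch using `scipy.sparse` with a synthetic mostly-zero matrix (the matrix contents are illustrative, not from the course data):

```python
import numpy as np
from scipy import sparse

# A mostly-zero matrix: dense storage keeps every entry,
# sparse storage keeps only the nonzero values and their positions.
dense = np.zeros((1000, 1000))
dense[0, 0] = 1.0
dense[500, 250] = 2.5

sp = sparse.csr_matrix(dense)   # compressed sparse row format
print(sp.nnz)                   # number of stored (nonzero) values: 2
print(sp.toarray()[500, 250])   # convert back to dense to inspect: 2.5
```

The dense array stores a million floats; the CSR version stores just the two nonzero entries and their coordinates.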
Dimensionality Reduction
Why?
Compact representation
Remove noise from original matrix
Plot high-dimensional data to show relationships
SVD preserves distance
SVD can improve distance
Find relationships between features
Principal Component Analysis – find vectors of highest variance
How?
Principal Component Analysis
Use Case 1: Compression & Denoising
Use Case 2: Visualization
Low-dimensional vectors can be visualized!
See example notebooks
Use Case 3: Better Neighborhoods
High-dimensional spaces have 2 problems for distance:
Distance more expensive to compute
Points approach equidistant in high-dimensional space
Decomposed matrices can improve this!
k-NN classification
k-means clustering
Use Case 4: Categorical Interactions
Wrapping Up
Matrix decomposition (also called matrix factorization or dimensionality reduction) breaks a high-dimensional matrix into a low-dimensional one.
It preserves distance and, in some configurations, finds the direction of maximum variance.
Photo by Thomas Willmott on Unsplash
Resources
Movie Decomposition
The Movie Decomposition notebook demonstrates matrix decomposition with movie data.

Clustering
This video introduces the concept of clustering , another useful unsupervised learning technique.

CS 533
INTRO TO DATA SCIENCE
Michael Ekstrand
CLUSTERING
Learning Outcomes
Understand the idea of ‘clustering’
Interpret the results of clustering with k-means
Photo by Markus Winkler on Unsplash
Grouping Things Together
What if we want to find groups in our data points?
We don’t know the groups (or we would classify)
Find them from the data
This is clustering
Membership Kinds
Mixed-membership: point can be in more than one cluster
Matrix factorization can be a kind of clustering
Single-membership: point is in precisely one cluster
Centroid-Based Clustering
K-Means Algorithm
Clustering in SKlearn
KMeans class
fit(X) learns cluster centers (can take y but will ignore)
predict(X) maps data points to cluster numbers
cluster_centers_ has cluster centers (in input space)
Other clustering algorithms have similar interface.
Evaluating Clusters
Look at them
Seriously. Look at them.
If you have labels, compare
Useful for understanding behavior
Quality scores
E.g. silhouette compares inter- and intra-cluster distances
Can be used to compare clusterings, no absolute quality values
Wrapping Up
Clustering allows us to identify groups of items from the data.
May or may not make sense.
Cluster quality depends on features, metric, cluster count, and more.
Photo by Igor Milicevic on Unsplash
Resources
Clustering Example
The clustering example notebook shows how to use the `KMeans`

class.

Vector Spaces
This video talks about vector spaces and transforms.

CS 533
INTRO TO DATA SCIENCE
Michael Ekstrand
VECTOR SPACES
Learning Outcomes
Introduce more formally the concept of a vector space
Understand vector space transformations
Photo by Markus Winkler on Unsplash
Vector Spaces
Vector Operations
Addition (and subtraction)
Scalar multiplication
Inner products (sum of elementwise products)
Distance (inner product of subtraction with itself)
Matrix of Data Points
What Is A Matrix?
A collection of row vectors
A collection of column vectors
A linear map from one vector space to another
A few matrix ops:
Addition
Multiplication (by scalar or compatible matrix or vector)
Transpose
Special Matrices
Matrix-Vector Multiplication
Transformations
All by multiplying by a matrix:
Reduce (or increase) dimensionality
Translate
Scale
Skew
Rotate
Any linear transformation (this is actually what linear means)
Linear Systems
Wrapping Up
Vectors represent data points in a vector space.
These can be manipulated and transformed.
Linear algebra teaches much more.
Photo by Jurica Koletić on Unsplash
Practice: SVD on Paper Abstracts
The Week 13 Exercise notebook demonstrates latent semantic analysis on paper abstracts and has an exercise to classify text into new or old papers.

It requires the chi-papers.csv file, which is derived from the HCI Bibliography .
It is the abstracts from papers published at the CHI conference (the primary conference for human-computer interaction) over a period of nearly 40 years.

If you want to see how to create this file, see the Fetch CHI Papers example .

Assignment 6
Assignment 6 is due November 22, 2020 .