Unsupervised, Recommenders & Reinforcement
Clustering and anomaly detection foundations from the URRL sequence, with in-depth notes and interactive labs.
Concepts Covered
Welcome
Course map: unsupervised learning first, recommender systems next, reinforcement learning after that.
What Is Clustering?
Clustering finds structure in unlabeled data by grouping similar points together.
K-Means Intuition
K-means alternates between assigning points to nearest centroids and moving centroids to cluster means.
K-Means Algorithm
Formal K-means procedure with assignment equations, centroid updates, and empty-cluster handling.
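The alternating assignment and update steps can be sketched in NumPy (a minimal illustration, not a reference implementation; re-seeding an empty cluster at a random data point is one common handling choice):

```python
import numpy as np

def kmeans(X, k, n_iters=100, rng=None):
    """Plain k-means: alternate assignment and centroid-update steps."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(rng)
    # Initialize centroids at k distinct randomly chosen data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point joins the cluster of its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                # Empty-cluster handling: re-seed at a random data point.
                centroids[j] = X[rng.choice(len(X))]
            else:
                centroids[j] = members.mean(axis=0)
    # Final assignment so labels match the returned centroids.
    labels = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2).argmin(axis=1)
    return centroids, labels
```

With well-separated data, a call like `kmeans(X, 2)` recovers the cluster centers up to ordering.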
Optimization Objective
K-means minimizes distortion: average squared distance from each point to its assigned centroid.
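The distortion objective can be computed directly from the assignments; a minimal sketch:

```python
import numpy as np

def distortion(X, centroids, labels):
    # J = (1/m) * sum_i ||x_i - mu_{c(i)}||^2 :
    # average squared distance from each point to its assigned centroid.
    diffs = X - centroids[labels]
    return np.mean(np.sum(diffs ** 2, axis=1))
```

Each k-means iteration can only decrease (or leave unchanged) this quantity, which is why it is a useful convergence check.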
Initializing K-Means
Initialization quality strongly affects final clustering; multi-start runs improve robustness.
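Multi-start can be sketched as: run k-means from several random initializations and keep the run with the lowest distortion (a minimal NumPy sketch; `n_starts` and the iteration count are illustrative choices):

```python
import numpy as np

def kmeans_once(X, k, rng, n_iters=50):
    # One k-means run from a random initialization; also returns its distortion.
    c = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        labels = np.linalg.norm(X[:, None] - c[None], axis=2).argmin(axis=1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                c[j] = pts.mean(axis=0)
    labels = np.linalg.norm(X[:, None] - c[None], axis=2).argmin(axis=1)
    cost = np.mean(np.sum((X - c[labels]) ** 2, axis=1))
    return c, labels, cost

def kmeans_multistart(X, k, n_starts=20, seed=0):
    # Keep the solution with the lowest distortion across all starts.
    rng = np.random.default_rng(seed)
    return min((kmeans_once(X, k, rng) for _ in range(n_starts)),
               key=lambda run: run[2])
```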
Choosing the Number of Clusters
Choosing K is often ambiguous; combine elbow hints with downstream business tradeoffs.
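One way to read an elbow is to compute distortion for a range of K values and look for where the curve flattens. A sketch using SciPy's `scipy.cluster.vq.kmeans` (which reports the mean distance to the nearest centroid; the three-blob data is synthetic):

```python
import numpy as np
from scipy.cluster.vq import kmeans

rng = np.random.default_rng(0)
# Three well-separated blobs; the elbow should appear around K = 3.
X = np.vstack([rng.normal(c, 0.3, (50, 2)) for c in ((0, 0), (5, 0), (0, 5))])

distortions = {}
for k in range(1, 7):
    _, d = kmeans(X, k, seed=0)  # best of several random restarts
    distortions[k] = d
    print(k, round(d, 3))
```

Distortion always decreases as K grows, so the elbow is a hint, not a rule; the final choice still depends on what the clusters are for.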
Finding Unusual Events
Anomaly detection learns normal behavior and flags low-probability events for inspection.
Gaussian (Normal) Distribution
Gaussian distributions model feature likelihood via mean and variance, forming the basis of simple anomaly scoring.
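The per-feature fit is just the sample mean and variance (the maximum-likelihood estimates); a minimal sketch:

```python
import numpy as np

def fit_gaussian(X):
    # Maximum-likelihood estimates: per-feature sample mean and variance.
    return X.mean(axis=0), X.var(axis=0)

def gaussian_density(x, mu, var):
    # Univariate normal pdf, evaluated feature-wise.
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
```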
Anomaly Detection Algorithm
Fit one Gaussian per feature, multiply the per-feature densities into p(x), then flag x as anomalous when p(x) falls below a threshold epsilon.
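The full scoring pipeline can be sketched end to end (the training distribution, epsilon value, and test points here are illustrative):

```python
import numpy as np

def fit_params(X):
    # Per-feature Gaussian parameters estimated from (assumed mostly normal) data.
    return X.mean(axis=0), X.var(axis=0)

def p_x(x, mu, var):
    # p(x) = product over features of the univariate normal densities.
    dens = np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    return np.prod(dens, axis=-1)

rng = np.random.default_rng(0)
X_train = rng.normal([10.0, 4.0], [1.0, 0.5], size=(500, 2))
mu, var = fit_params(X_train)

epsilon = 1e-4  # in practice, tuned on a labeled cross-validation set
x_normal = np.array([10.2, 3.9])
x_odd = np.array([10.0, 9.0])  # second feature far outside the normal range
print(p_x(x_normal, mu, var) >= epsilon, p_x(x_odd, mu, var) < epsilon)  # prints: True True
```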
Developing and Evaluating an Anomaly Detection System
Use a labeled cross-validation set containing known anomalies to tune epsilon and features; evaluate with skew-aware metrics such as precision, recall, and F1.
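Epsilon selection by F1 on a labeled cross-validation set can be sketched as follows (the threshold grid of 1000 candidates is an illustrative choice):

```python
import numpy as np

def f1_score(y_true, y_pred):
    # F1 from true/false positive and false negative counts;
    # robust to skew where plain accuracy is not.
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

def choose_epsilon(p_cv, y_cv):
    # Scan thresholds spanning the CV probabilities; keep the best F1.
    best_eps, best_f1 = 0.0, -1.0
    for eps in np.linspace(p_cv.min(), p_cv.max(), 1000):
        score = f1_score(y_cv, (p_cv < eps).astype(int))
        if score > best_f1:
            best_eps, best_f1 = eps, score
    return best_eps, best_f1
```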
Anomaly Detection vs Supervised Learning
Pick anomaly detection for rare and evolving positives; pick supervised learning when positives are sufficiently labeled and stable.
Choosing What Features to Use
Feature shaping and engineering are critical in anomaly detection; transform skewed variables and iterate via error analysis.
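A common transform for a right-skewed feature is log(x + c), which makes it look more Gaussian before fitting; a sketch on synthetic lognormal data (the constant c = 1e-3 and sample size are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)  # heavily right-skewed

def skewness(v):
    # Sample skewness: third standardized moment (0 for a symmetric distribution).
    return np.mean((v - v.mean()) ** 3) / v.std() ** 3

x_log = np.log(x + 1e-3)  # log transform pulls the long right tail in
print(round(skewness(x), 2), round(skewness(x_log), 2))
```

Checking a histogram (or the skewness, as here) before and after each candidate transform is a quick way to iterate during error analysis.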