Supervised Learning Algorithms
Transcript-backed ML fundamentals: linear regression, logistic regression, gradient descent, feature scaling, overfitting, and regularization.
Concepts Covered
Introduction to Machine Learning
What ML is, where you already use it daily, and why this matters.
Why Machine Learning Matters
ML as the dominant path to AI; the $13-trillion opportunity ahead.
ML Definition & Types
Supervised, unsupervised, and reinforcement learning — when to use each.
Supervised Learning — Regression
Predicting continuous output values — the engine behind 99% of ML's economic value.
Supervised Learning — Classification
Predicting discrete categories rather than continuous values.
Unsupervised Learning
Finding hidden structure in data with no labels — clustering, anomaly detection, and more.
Unsupervised — Anomaly Detection
Detecting fraud, defects, and outliers with anomaly detection — plus a recap of the main types of unsupervised learning.
Jupyter Labs & Dev Environment
The industry-standard ML environment — the exact same tool used at Google, Meta, and Amazon.
Linear Regression Pipeline
Your first supervised learning model — probably the most widely used ML algorithm in the world.
The Supervised Learning Pipeline
How supervised learning actually works end-to-end — training set in, function out.
Cost Function
Measuring how wrong your model is — Mean Squared Error (MSE) explained.
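As a preview, the MSE cost is a few lines of NumPy. This is a minimal sketch assuming the single-feature model f(x) = w·x + b and the 1/(2m) convention common in course material; the function name and toy data are illustrative, not from the lesson:

```python
import numpy as np

def compute_cost(x, y, w, b):
    """Mean Squared Error cost for the linear model f(x) = w*x + b.

    Uses the 1/(2m) convention, which cancels the factor of 2
    when the gradient is taken later.
    """
    m = len(x)
    predictions = w * x + b
    return np.sum((predictions - y) ** 2) / (2 * m)

# Perfect fit: y = 2x + 1 exactly, so the cost is zero.
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 7.0])
print(compute_cost(x, y, w=2.0, b=1.0))  # → 0.0
```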
Cost Function Intuition
What the cost function looks like — and why the bowl shape matters.
Cost Visualisation in 3D
Contour plots and the 3D bowl — seeing the optimisation landscape with two parameters.
Parameters, Model & Cost — Together
Connecting the model line, cost function, and contour plot into one unified picture.
Gradient Descent — Concept
The core optimisation algorithm that trains virtually every ML model.
Gradient Descent — Update Rule
The actual update equations — the math behind every gradient step.
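For reference, the update equations take this standard form, assuming the one-feature model f_{w,b}(x) = wx + b, the 1/(2m) MSE cost, and simultaneous updates of both parameters:

```latex
w := w - \alpha \frac{\partial J}{\partial w}
   = w - \alpha \cdot \frac{1}{m}\sum_{i=1}^{m}\bigl(f_{w,b}(x^{(i)}) - y^{(i)}\bigr)\,x^{(i)}

b := b - \alpha \frac{\partial J}{\partial b}
   = b - \alpha \cdot \frac{1}{m}\sum_{i=1}^{m}\bigl(f_{w,b}(x^{(i)}) - y^{(i)}\bigr)
```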
Derivative Intuition for Gradient Descent
The tangent line trick — why the sign and magnitude of the gradient guide every step.
Learning Rate
The most critical hyperparameter — too large diverges, too small barely moves.
Completing Linear Regression
The complete training loop: model + cost + gradient derivation all in one.
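Such a loop can be sketched compactly in NumPy. This is illustrative, assuming batch gradient descent on the one-feature MSE cost; the function name and toy data are made up:

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, iters=1000):
    """Fit f(x) = w*x + b by batch gradient descent on the MSE cost."""
    m = len(x)
    w, b = 0.0, 0.0
    for _ in range(iters):
        err = (w * x + b) - y          # prediction error for every example
        dw = (err * x).sum() / m       # ∂J/∂w for the 1/(2m) MSE cost
        db = err.sum() / m             # ∂J/∂b
        w -= alpha * dw                # simultaneous parameter update
        b -= alpha * db
    return w, b

x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0                      # ground truth: w = 2, b = 1
w, b = gradient_descent(x, y)
```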
Gradient Descent — Live Demo
Watching the algorithm actually run — the parameter trajectory toward the minimum.
Multiple Linear Regression
Extending to many features simultaneously — the vectorised dot product form.
Vectorisation
Why vectorised code is 100× faster — numpy and hardware parallelism.
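The speed-up is easy to see with a dot product. A sketch comparing a pure-Python loop against `np.dot` (exact timings vary by machine; the loop runs one multiply-add at a time in the interpreter, while `np.dot` dispatches to optimised parallel C/BLAS code):

```python
import numpy as np
import time

n = 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

# Loop version: one multiply-add per interpreter step.
start = time.perf_counter()
total = 0.0
for i in range(n):
    total += a[i] * b[i]
loop_time = time.perf_counter() - start

# Vectorised version: a single call into optimised native code.
start = time.perf_counter()
total_vec = np.dot(a, b)
vec_time = time.perf_counter() - start
```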
Vectorisation — Under the Hood
How NumPy, BLAS, and GPU kernels actually execute computations in parallel.
Feature Scaling
Normalising features so gradient descent converges faster — a must-do step.
Implementing Feature Scaling
Coding z-score normalisation from scratch; using sklearn's StandardScaler.
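A from-scratch z-score sketch; the sklearn equivalent is `sklearn.preprocessing.StandardScaler`, and note that NumPy's `std` defaults to the population standard deviation, matching it. The toy data here is illustrative:

```python
import numpy as np

def zscore_normalize(X):
    """Scale each feature (column) to mean 0 and standard deviation 1."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

# Two features on very different scales, e.g. house size vs. bedrooms.
X = np.array([[2104.0, 5.0],
              [1416.0, 3.0],
              [ 852.0, 2.0]])
X_norm, mu, sigma = zscore_normalize(X)
```

Keep `mu` and `sigma`: predictions on new data must be scaled with the training-set statistics, not recomputed ones.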
Gradient Descent Convergence
The learning curve — how to tell when training is done and when it's broken.
Choosing the Learning Rate
The log-scale sweep strategy for finding a good α systematically.
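The sweep can be sketched like this: try values roughly 3× apart on a log scale and compare final costs. The helpers and toy data are illustrative, assuming the one-feature MSE setup from the earlier lessons:

```python
import numpy as np

def cost(x, y, w, b):
    return ((w * x + b - y) ** 2).sum() / (2 * len(x))

def train(x, y, alpha, iters=200):
    """Run gradient descent and report the final cost."""
    m = len(x)
    w = b = 0.0
    for _ in range(iters):
        err = w * x + b - y
        w -= alpha * (err * x).sum() / m
        b -= alpha * err.sum() / m
    return cost(x, y, w, b)

x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Candidate rates ~3x apart on a log scale; too small barely moves,
# too large diverges, so the best sits near the top of the stable range.
alphas = [0.001, 0.003, 0.01, 0.03, 0.1, 0.3]
results = {a: train(x, y, a) for a in alphas}
best_alpha = min(results, key=lambda a: results[a])
```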
Feature Engineering
Creating better input features using domain knowledge — often the biggest performance lever.
Polynomial Regression
Fitting curves not just lines — by engineering x², x³ as new features.
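A sketch of the feature-engineering trick: stack x², x³ alongside x and fit an ordinary linear model on the new columns. To keep the example short this uses a closed-form least-squares fit (`np.linalg.lstsq`) rather than the course's gradient descent; the data is a noise-free cubic chosen for illustration:

```python
import numpy as np

x = np.linspace(0.0, 2.0, 20)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.5 * x**3   # known cubic, no noise

# The engineered features: the model stays linear in the weights
# even though the fitted curve is cubic in x.
X = np.column_stack([x, x**2, x**3])

# Least squares with an intercept column.
A = np.column_stack([np.ones_like(x), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # recovers [1, 2, -3, 0.5]
```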
Classification — Deep Dive
Why linear regression fails for classification and what to use instead.
Logistic Regression
The sigmoid function — squashing any real number into (0, 1), so the output can be read as a probability.
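The function itself is one line of NumPy (a sketch, not course code):

```python
import numpy as np

def sigmoid(z):
    """g(z) = 1 / (1 + e^(-z)): maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))    # 0.5 — the decision-boundary value
print(sigmoid(10.0))   # near 1: confident "class 1"
print(sigmoid(-10.0))  # near 0: confident "class 0"
```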
Decision Boundary
Where the model draws the line between classes — linear and non-linear boundaries.
Logistic Regression — Cost Function
Why MSE creates non-convex surfaces for classification; introducing log loss.
Simplified Logistic Loss
Combining the y=0 and y=1 cases into one elegant unified formula.
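A sketch of the unified formula in NumPy; the `eps` clipping is a standard numerical guard against log(0), not part of the formula, and the toy probabilities are illustrative:

```python
import numpy as np

def log_loss(y, p, eps=1e-12):
    """Unified binary cross-entropy:
        L = -[ y*log(p) + (1-y)*log(1-p) ]
    When y=1 only the first term survives; when y=0 only the second —
    one formula covers both cases of the piecewise definition.
    """
    p = np.clip(p, eps, 1 - eps)   # avoid log(0)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).mean()

y = np.array([1, 0, 1, 0])
p_good = np.array([0.9, 0.1, 0.8, 0.2])   # confident and correct
p_bad  = np.array([0.1, 0.9, 0.2, 0.8])   # confident and wrong
```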
Gradient Descent for Logistic Regression
Same update rule as linear regression — but with sigmoid applied underneath.
Overfitting & Underfitting
The bias-variance tradeoff — the single most important concept in applied ML.
Regularisation — Concept
Adding a penalty for large weights — the elegant way to prevent overfitting.
Regularisation — Math for Linear Regression
L2 penalty added to MSE; weight decay in the gradient update.
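A sketch of one regularised gradient step, assuming an L2 penalty of (λ/2m)·Σwⱼ² that, by convention, leaves the bias b unpenalised; the function name and toy data are illustrative:

```python
import numpy as np

def regularized_step(X, y, w, b, alpha, lam):
    """One gradient step on MSE + (lam/(2m)) * sum(w_j^2).

    The penalty adds (lam/m) * w_j to each weight's gradient, so every
    step shrinks w by a factor (1 - alpha*lam/m) before the usual
    update — the "weight decay" effect.
    """
    m = X.shape[0]
    err = X @ w + b - y
    dw = X.T @ err / m + (lam / m) * w   # data gradient + decay term
    db = err.sum() / m                   # b is not penalised
    return w - alpha * dw, b - alpha * db

X = np.array([[1.0, 2.0], [2.0, 0.5], [3.0, 1.0]])
y = np.array([2.0, 3.0, 5.0])
w = np.array([1.0, 1.0]); b = 0.0
w_new, b_new = regularized_step(X, y, w, b, alpha=0.1, lam=1.0)
```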
Regularised Logistic Regression
Applying L2 regularisation to logistic regression — the production standard.