Guided Starter Example
The model predicts house prices of [300K, 400K, 500K]. The true prices are [280K, 420K, 480K], so the errors are [+20K, −20K, +20K]. Cost = average of [(20K)², (20K)², (20K)²] / 2 = (20K)² / 2 = 200,000,000: a single 'wrongness' score to minimise.
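The arithmetic in this example can be checked with a short Python sketch. The variable names are illustrative assumptions, and the prices are written out in full rather than in thousands.

```python
# Prices from the example, written out in full (300K -> 300_000).
predicted = [300_000, 400_000, 500_000]
actual = [280_000, 420_000, 480_000]

# Signed error for each house: prediction minus true price.
errors = [p - a for p, a in zip(predicted, actual)]

# Squaring removes the sign, so all three houses contribute equally.
squared_errors = [e ** 2 for e in errors]

# Average the squared errors, then halve (the 1/(2m) convention).
m = len(actual)
cost = sum(squared_errors) / (2 * m)

print(errors)          # [20000, -20000, 20000]
print(squared_errors)  # [400000000, 400000000, 400000000]
print(cost)            # 200000000.0
```

The result is in squared price units, which is why the raw number looks so large compared with the 20K errors.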
Measuring how wrong your model is — Mean Squared Error (MSE) explained.
The cost function (also called loss function) answers the question: 'How wrong is my model right now?'
It distils the model's performance on all training examples into a single number. The goal of training is to find the parameters (w, b) that make this number as small as possible.
Mean Squared Error (MSE) for linear regression:
J(w,b) = (1/(2m)) × Σᵢ (ŷᵢ − yᵢ)², where the sum runs over all m training examples (i = 1, …, m)
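Breaking this down step by step: ŷᵢ is the model's prediction for example i and yᵢ is the true value; (ŷᵢ − yᵢ) is the error on that example; squaring makes every error non-negative; Σ adds the squared errors over all m training examples; dividing by m turns the total into an average; and the extra factor of 2 is a convention that keeps the derivative tidy later on.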
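The formula can be traced one step at a time in a minimal Python sketch. It assumes a single-feature linear model ŷ = w·x + b; the function name `compute_cost` and the toy data are hypothetical, chosen only for illustration.

```python
def compute_cost(x, y, w, b):
    """Return J(w, b) = (1 / (2m)) * sum((y_hat_i - y_i) ** 2)."""
    m = len(x)
    total = 0.0
    for x_i, y_i in zip(x, y):
        y_hat_i = w * x_i + b   # step 1: prediction for example i
        error = y_hat_i - y_i   # step 2: signed error
        total += error ** 2     # step 3: square it and add to the sum
    return total / (2 * m)      # step 4: average, halved by convention

# Hypothetical toy data (not from the lesson).
x_train = [1.0, 2.0, 3.0]
y_train = [3.0, 5.0, 7.0]

# With w = 1, b = 1 the predictions are [2, 3, 4], so the errors are
# [-1, -2, -3] and the cost is (1 + 4 + 9) / 6.
cost = compute_cost(x_train, y_train, w=1.0, b=1.0)
print(cost)
```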
Intuition: If the model predicts house prices perfectly for every example, J = 0. The worse the predictions, the higher J climbs. Training is the process of making J as close to 0 as possible.
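As a rough illustration of this intuition, the sketch below evaluates J for a few candidate values of w on hypothetical data generated by y = 3x. The data, the helper name `cost`, and the candidate list are all assumptions made for the example; real training uses an optimisation algorithm rather than trying a handful of values by hand.

```python
# Hypothetical toy data that lies exactly on y = 3x, so w = 3, b = 0 is a perfect fit.
x_train = [1.0, 2.0, 4.0]
y_train = [3.0, 6.0, 12.0]

def cost(w, b=0.0):
    """Halved mean squared error of the line y_hat = w * x + b on the toy data."""
    m = len(x_train)
    return sum((w * x + b - y) ** 2 for x, y in zip(x_train, y_train)) / (2 * m)

# Evaluate J at a few candidate slopes and pick the one with the lowest cost.
candidates = [1.0, 2.0, 3.0, 4.0]
costs = {w: cost(w) for w in candidates}
best_w = min(costs, key=costs.get)

for w, j in costs.items():
    print(f"w = {w}: J = {j}")
print("lowest cost at w =", best_w)
```

J is 0 at w = 3 and grows as w moves away from it in either direction.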
Why square errors? Two reasons: (1) negative and positive errors don't cancel each other out, and (2) large errors get penalised much more heavily than small ones — a prediction that's off by 10 contributes 100 to the cost, while being off by 1 contributes only 1.
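Both reasons can be seen in a few lines of Python. The numbers are illustrative assumptions, not values from the lesson.

```python
# Reason 1: signed errors cancel, squared errors do not.
errors = [10, -10]
signed_sum = sum(errors)                    # 0: looks like a perfect model
squared_sum = sum(e ** 2 for e in errors)   # 200: both mistakes are counted

# Reason 2: a 10x larger error costs 100x more once squared.
small_error, large_error = 1, 10
penalty_ratio = large_error ** 2 / small_error ** 2

print(signed_sum, squared_sum, penalty_ratio)
```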
First-time learner note: Read each model as a dataflow system: inputs become representations, representations become scores, and scores become decisions through a chosen loss and thresholding policy.
Production note: Track three things relentlessly in ML systems: data shape contracts, evaluation methodology, and the operational meaning of the model's errors. Most expensive failures come from one of those three.
Source-grounded Practical Scenario
To build a cost function that does not automatically grow as the training set gets larger, by convention we compute the average squared error instead of the total squared error, which we do by dividing by m.
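The effect of dividing by m can be sketched as follows. The error values and function names are hypothetical; the larger set is simply the smaller one repeated 100 times, so the model is equally wrong on both.

```python
def total_squared_error(errors):
    """Sum of squared errors: grows with the number of examples."""
    return sum(e ** 2 for e in errors)

def average_squared_error(errors):
    """Total squared error divided by m: independent of dataset size."""
    return total_squared_error(errors) / len(errors)

# Same per-example errors, different dataset sizes (m = 3 vs m = 300).
small_set = [2.0, -2.0, 2.0]
large_set = small_set * 100

print(total_squared_error(small_set), total_squared_error(large_set))
print(average_squared_error(small_set), average_squared_error(large_set))
```

The totals differ by a factor of 100 (12.0 versus 1200.0), while the averages are both 4.0, which makes costs comparable across training sets of different sizes.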
What is a cost function (a.k.a. loss function) in ML?
A single number that measures how wrong the model's predictions are across all training examples. Training = finding parameters that minimise this number. Think of it as the model's 'score on wrongness'.