Classification asks for category decisions, not unconstrained numeric values. That is why plain linear regression is structurally wrong for binary tasks.
Failure modes of linear regression for classification:
- Predictions can be less than 0 or greater than 1, so they are not valid probabilities.
- Decision behavior is fragile under outliers; one extreme point can move the boundary too much.
- The squared-error objective is misaligned with probabilistic classification goals.
These issues motivate logistic regression, which maps logits to probabilities in [0,1] and supports principled thresholding.
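The contrast can be sketched numerically. Below is a minimal pure-Python illustration (the weights `w=0.8, b=-2.0` are hypothetical, chosen only for the demo) of why a raw linear score is not a valid probability while the sigmoid-squashed score always is:

```python
import math

def linear_score(x, w=0.8, b=-2.0):
    # Raw linear regression output: unbounded, so not a valid probability.
    return w * x + b

def sigmoid(z):
    # Logistic function: squashes any real z into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def logistic_prob(x, w=0.8, b=-2.0):
    # Logistic regression: the same linear score, mapped to a probability.
    return sigmoid(linear_score(x, w, b))

for x in (0.0, 5.0, 10.0):
    print(x, linear_score(x), round(logistic_prob(x), 3))
# At x=10 the linear score is 6.0, outside [0, 1];
# logistic_prob stays strictly between 0 and 1 for every input.
```

The two models share the same linear core; only the output mapping differs, which is what makes the logistic output safe to threshold as a probability.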
Classification vs regression:
- Regression outputs continuous quantities.
- Classification outputs one class from a finite set.
Important production concepts introduced here: class imbalance, threshold tuning, and cost-sensitive decisions. The best threshold is rarely 0.5 when false positives and false negatives have different business costs.
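Cost-sensitive threshold tuning can be sketched as a small search over candidate thresholds. Everything below (the scores, labels, and the 1:10 false-positive to false-negative cost ratio) is hypothetical, chosen only to show the mechanics:

```python
# Minimal cost-sensitive threshold-tuning sketch (all data below is hypothetical).
COST_FP = 1.0   # cost of a false positive
COST_FN = 10.0  # cost of a false negative (e.g. a missed diagnosis)

scores = [0.05, 0.2, 0.35, 0.5, 0.6, 0.8, 0.9]  # model probabilities
labels = [0,    0,   1,    0,   1,   1,   1]     # true classes

def expected_cost(threshold):
    # Total business cost of classifying at the given threshold.
    cost = 0.0
    for s, y in zip(scores, labels):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 0:
            cost += COST_FP  # false positive
        elif pred == 0 and y == 1:
            cost += COST_FN  # false negative
    return cost

# Scan thresholds and keep the cheapest one.
best_cost, best_t = min((expected_cost(t / 100), t / 100)
                        for t in range(1, 100))
print(best_cost, best_t)
```

Because false negatives cost ten times as much here, the cheapest threshold lands well below 0.5: it is worth accepting extra false positives to avoid missing positives.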
Deepening Notes
Source-backed reinforcement: these points are extracted from the session source note to strengthen your theory intuition.
- Here's a graph of the dataset where the horizontal axis is the tumor size and the vertical axis takes on only the values 0 and 1, because this is a classification problem.
- To build up to the logistic regression algorithm, there's an important mathematical function to describe, called the sigmoid function, sometimes also referred to as the logistic function.
- This is the logistic regression model: it takes as input a feature or set of features x and outputs a number between 0 and 1.
- If someday you read research papers or blog posts about logistic regression, sometimes you see the notation f(x) = P(y = 1 | x; w, b): the probability that y equals 1 given the input features x, with parameters w and b.
- This gives you a few different ways to map the numbers the model outputs, such as 0.3, 0.65, or 0.7, to a prediction of whether y is actually 0 or 1.
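The points above can be collected into one sketch: the model f(x) = sigmoid(wx + b), read as P(y = 1 | x; w, b), followed by a thresholding step that turns the probability into a class. The parameters `w=1.2, b=-6.0` are hypothetical stand-ins for a fitted tumor-size model:

```python
import math

def sigmoid(z):
    # Logistic function: maps any real number into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def f(x, w, b):
    # Logistic regression model: f(x) = P(y = 1 | x; w, b).
    return sigmoid(w * x + b)

def predict(x, w, b, threshold=0.5):
    # Map the output probability (e.g. 0.3, 0.65, 0.7) to a class label.
    return 1 if f(x, w, b) >= threshold else 0

# Hypothetical fitted parameters for a single feature (tumor size):
w, b = 1.2, -6.0
print(round(f(5.0, w, b), 3), predict(5.0, w, b))
# f(5.0) = sigmoid(1.2 * 5.0 - 6.0) = sigmoid(0.0) = 0.5
```

Note the threshold is an explicit, tunable argument rather than a fixed 0.5, which is exactly where cost-sensitive tuning plugs in.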
Interview-Ready Deepening
Source-backed reinforcement: these points add detail beyond short-duration UI hints and emphasize production tradeoffs.
- Why linear regression fails for classification and what to use instead.
- Here's a graph of the dataset where the horizontal axis is the tumor size and the vertical axis takes on only the values 0 and 1, because this is a classification problem.
- This is the logistic regression model: it takes as input a feature or set of features x and outputs a number between 0 and 1.
- That is why plain linear regression is structurally wrong for binary tasks.
- These issues motivate logistic regression, which maps logits to probabilities in [0,1] and supports principled thresholding.
- Predictions can be less than 0 or greater than 1, so they are not valid probabilities.
- Error objective is misaligned with probabilistic classification goals.
- Classification outputs one class from a finite set.
Tradeoffs You Should Be Able to Explain
- More expressive models improve fit but can reduce interpretability and raise overfitting risk.
- Higher optimization speed can reduce training time but may increase instability if learning dynamics are not monitored.
- Feature-rich pipelines improve performance ceilings but increase maintenance and monitoring complexity.
First-time learner note: Read each model as a dataflow system: inputs become representations, representations become scores, and scores become decisions through a chosen loss and thresholding policy.
Production note: Track three things relentlessly in ML systems: data shape contracts, evaluation methodology, and the operational meaning of the model's errors. Most expensive failures come from one of those three.