Polynomial regression captures curved patterns by expanding input features (x, x^2, x^3, ...).
ŷ = w₁x + w₂x² + w₃x³ + b
Even with this curve, the algorithm is still linear regression in parameter space. You changed the features, not the optimizer.
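A minimal sketch of this point: expand the features, then solve with ordinary least squares. The helper name `expand` and the synthetic data are illustrative, not from the source.

```python
import numpy as np

def expand(x, degree):
    """Stack [x, x^2, ..., x^degree] as columns; illustrative helper."""
    return np.column_stack([x ** d for d in range(1, degree + 1)])

# Noiseless synthetic curve: y = 2x^2 - x + 1.
x = np.linspace(-3, 3, 50)
y = 2 * x**2 - x + 1

# Plain linear least squares on the expanded features recovers the
# polynomial coefficients -- the optimizer never changed, only the features.
X = np.column_stack([expand(x, degree=2), np.ones_like(x)])  # last column is the bias
w, *_ = np.linalg.lstsq(X, y, rcond=None)
# w corresponds to (w1 for x, w2 for x^2, b)
```

The fit is exact here because the data is noiseless and the model family contains the true curve; the same solver call handles any feature expansion.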
Core design tradeoff: increasing degree raises expressiveness but also variance. Low degree underfits; very high degree memorizes noise and becomes unstable outside the training range.
Practical constraints:
- Always scale polynomial features; magnitudes explode rapidly.
- Use validation curves to select degree, not intuition alone.
- Regularization (L2/L1) is often required as degree grows.
- Avoid extrapolation promises; high-degree polynomials can behave wildly beyond observed x-range.
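The first two constraints above can be sketched together: standardize the expanded features using training statistics only, then pick the degree from a held-out validation curve. The data, split, and helper names here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-2, 2, 80))
y = np.sin(x) + rng.normal(0, 0.1, size=x.shape)  # smooth target + noise

# Held-out split for the validation curve.
x_tr, y_tr = x[::2], y[::2]
x_va, y_va = x[1::2], y[1::2]

def val_predictions(degree):
    def feats(z):
        return np.column_stack([z ** d for d in range(1, degree + 1)])
    F_tr, F_va = feats(x_tr), feats(x_va)
    # Scale with TRAIN statistics only -- high powers explode otherwise.
    mu, sd = F_tr.mean(axis=0), F_tr.std(axis=0)
    F_tr, F_va = (F_tr - mu) / sd, (F_va - mu) / sd
    A = np.column_stack([F_tr, np.ones(len(x_tr))])
    w, *_ = np.linalg.lstsq(A, y_tr, rcond=None)
    return np.column_stack([F_va, np.ones(len(x_va))]) @ w

val_mse = {d: np.mean((val_predictions(d) - y_va) ** 2) for d in range(1, 10)}
best = min(val_mse, key=val_mse.get)  # degree chosen by the data, not intuition
```

Plotting `val_mse` against degree gives the classic U-shaped validation curve: error falls as underfitting eases, then rises as the model starts memorizing noise.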
Modeling mindset: polynomial terms are one option, not default. Choose feature forms that reflect domain behavior (diminishing returns, saturation, thresholds) rather than blindly increasing degree.
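As one hedged illustration of a domain-informed feature form (the data and coefficients are invented): a single log feature captures diminishing returns that would otherwise need several polynomial terms.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(1, 100, 200)                                   # e.g., some input with diminishing returns
y = 3.0 * np.log(x) + 5.0 + rng.normal(0, 0.2, size=x.shape)   # true slope 3, intercept 5

# One domain-motivated feature instead of a high-degree polynomial.
A = np.column_stack([np.log(x), np.ones_like(x)])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
# w[0] estimates the log slope, w[1] the intercept
```

Because the feature matches the generating process, two parameters suffice and extrapolation stays sane, which a degree-8 polynomial would not guarantee.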
Deepening Notes
Source-backed reinforcement: these points are extracted from the session source note to strengthen your theory intuition.
- It turns out that linear regression is not a good algorithm for classification problems.
- Let's take a look at why, and this will lead us into a different algorithm called logistic regression.
- This type of classification problem where there are only two possible outputs is called binary classification.
- You learn more about the decision boundary in the next video, where you also learn about an algorithm called logistic regression.
- This is okay because it motivates the need for a different model for classification tasks.
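The transcript points above motivate logistic regression without showing its form. A minimal sketch (the standard logistic function, assumed here rather than taken from the source) of why it suits binary classification: it maps any linear score into the open interval (0, 1).

```python
import numpy as np

def sigmoid(z):
    """Logistic function: squashes any real score into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# A linear model's raw score is unbounded; the sigmoid turns it into
# something interpretable as P(y = 1 | x).
scores = np.array([-10.0, 0.0, 10.0])
probs = sigmoid(scores)
# probs stays strictly inside (0, 1), unlike raw linear outputs
```

This is exactly the fix the notes hint at: linear regression's outputs run over all real numbers, while binary labels need a bounded, probability-like score.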
Interview-Ready Deepening
Source-backed reinforcement: these points add detail beyond short-duration UI hints and emphasize production tradeoffs.
- Fitting curves, not just lines, by engineering x², x³ as new features.
- Polynomial regression captures curved patterns by expanding input features (x, x^2, x^3, ...).
- Linear regression can predict values other than zero and one, including values outside that range, which makes it a poor fit for binary labels.
- Avoid extrapolation promises; high-degree polynomials can behave wildly beyond observed x-range.
- It turns out that linear regression is not a good algorithm for classification problems.
- This is okay because it motivates the need for a different model for classification tasks.
- Even with this curve, the algorithm is still linear regression in parameter space.
- Core design tradeoff: increasing degree raises expressiveness but also variance.
Tradeoffs You Should Be Able to Explain
- More expressive models improve fit but can reduce interpretability and raise overfitting risk.
- Aggressive optimization settings (e.g., larger learning rates) can reduce training time but may increase instability if learning dynamics are not monitored.
- Feature-rich pipelines improve performance ceilings but increase maintenance and monitoring complexity.
First-time learner note: Read each model as a dataflow system: inputs become representations, representations become scores, and scores become decisions through a chosen loss and thresholding policy.
Production note: Track three things relentlessly in ML systems: data shape contracts, evaluation methodology, and the operational meaning of the model's errors. Most expensive failures come from one of those three.
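One way to make the first of those three concrete, as a hedged sketch (the contract values and function name are hypothetical, not from the source): validate a batch's shape contract before inference so violations fail loudly and early.

```python
import numpy as np

# Hypothetical contract for a tabular model's input batch:
# 2-D float array, fixed feature count, no NaN/inf values.
EXPECTED_FEATURES = 4  # illustrative value

def check_batch(X, n_features=EXPECTED_FEATURES):
    """Raise early, with a specific message, if the contract is violated."""
    X = np.asarray(X)
    if X.ndim != 2:
        raise ValueError(f"expected 2-D batch, got ndim={X.ndim}")
    if X.shape[1] != n_features:
        raise ValueError(f"expected {n_features} features, got {X.shape[1]}")
    if not np.isfinite(X).all():
        raise ValueError("batch contains NaN or inf")
    return X

ok = check_batch(np.zeros((8, 4)))  # conforming batch passes through unchanged
```

A check like this turns silent shape drift, one of the expensive failure modes named above, into an immediate, diagnosable error at the system boundary.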