Logistic Regression applies the sigmoid function to the output of a linear equation:
ŷ = σ(z) = 1 / (1 + e^(−z)) where z = w · x + b
The sigmoid (σ) maps any real number to the range (0, 1):
- z → +∞ : σ → 1 (very confident class 1)
- z = 0 : σ = 0.5 (maximum uncertainty)
- z → −∞ : σ → 0 (very confident class 0)
The output is interpreted as P(y=1 | x) — the probability the input belongs to the positive class. We typically classify as 1 if ŷ > 0.5.
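The formula and threshold rule above can be sketched in a few lines of Python. The helper names `sigmoid` and `predict` are illustrative, not from the source:

```python
import math

def sigmoid(z: float) -> float:
    """Map a real-valued logit z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(x: list[float], w: list[float], b: float, threshold: float = 0.5) -> int:
    """Linear score z = w . x + b, squashed by sigmoid, then thresholded."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if sigmoid(z) > threshold else 0

print(sigmoid(0.0))   # 0.5: maximum uncertainty
print(sigmoid(6.0))   # ~0.998: very confident class 1
print(sigmoid(-6.0))  # ~0.002: very confident class 0
```

Note that only the sign of z matters for the 0.5 threshold: σ(z) > 0.5 exactly when z > 0.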
Why sigmoid? It's the natural function that maps logits (log-odds) to probabilities. The logistic function has a beautiful property: it's differentiable everywhere, which makes gradient descent work smoothly.
The S-curve shape is the key intuition: flat near 0 and 1 (confident predictions), steep in the middle (uncertain region). The model becomes more confident as inputs move further from the decision boundary.
Deepening Notes
Source-backed reinforcement: these points are extracted from the session source note to strengthen your theory intuition.
- Now, let's take a look at the decision boundary to get a better sense of how logistic regression is computing these predictions.
- The logistic regression model will make predictions using this function f of x equals g of z, where z is now this expression over here, w1x1 plus w2x2 plus b, because we have two features x1 and x2.
- This line turns out to be the decision boundary, where if the features x are to the right of this line, logistic regression would predict 1, and to the left of this line, logistic regression will predict 0.
- In other words, what we have just visualized is the decision boundary for logistic regression when the parameters w_1, w_2, and b are 1, 1, and negative 3.
- We'll start by looking at the cost function for logistic regression and after that, figure out how to apply gradient descent to it.
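The decision-boundary example above, with w_1 = w_2 = 1 and b = −3, can be checked directly: the boundary is where z = 0, i.e. the line x1 + x2 = 3. A minimal sketch (the `predict` helper is illustrative):

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# Parameters from the example: w1 = 1, w2 = 1, b = -3.
w1, w2, b = 1.0, 1.0, -3.0

def predict(x1: float, x2: float) -> int:
    z = w1 * x1 + w2 * x2 + b
    return 1 if sigmoid(z) > 0.5 else 0

# The boundary z = 0 is the line x1 + x2 = 3.
print(predict(3.0, 3.0))  # 1: right of the line (x1 + x2 = 6 > 3)
print(predict(0.0, 0.0))  # 0: left of the line  (x1 + x2 = 0 < 3)
print(predict(1.0, 2.0))  # exactly on the line: z = 0, sigma = 0.5, so > 0.5 fails and it predicts 0
```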
Interview-Ready Deepening
Source-backed reinforcement: these points add detail beyond short-duration UI hints and emphasize production tradeoffs.
- The sigmoid function — squashing any real number into a probability in (0, 1).
- The logistic regression model will make predictions using this function f of x equals g of z, where z is now this expression over here, w1x1 plus w2x2 plus b, because we have two features x1 and x2.
- Logistic Regression applies the sigmoid function to the output of a linear equation:
- We'll start by looking at the cost function for logistic regression and after that, figure out how to apply gradient descent to it.
- The sigmoid (σ) maps any real number to the range (0, 1).
- Another way to write this is we can say f of x is equal to g, the sigmoid function, also called the logistic function, applied to w.x plus b, where this is, of course, the value of z.
- In other words, what we have just visualized is the decision boundary for logistic regression when the parameters w_1, w_2, and b are 1, 1, and negative 3.
- This implementation of logistic regression will predict y equals 1 inside this shape and outside the shape will predict y equals 0.
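The "predict 1 inside the shape" point above comes from adding polynomial features: with z built from squared terms, the z = 0 boundary becomes a closed curve rather than a line. The source doesn't specify the shape, so as an assumed illustration take z = −x1² − x2² + 1, whose boundary is the unit circle:

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# Assumed polynomial-feature parameters for illustration:
# z = w1*x1^2 + w2*x2^2 + b with w1 = w2 = -1, b = 1,
# so the boundary z = 0 is the circle x1^2 + x2^2 = 1.
def predict(x1: float, x2: float) -> int:
    z = -x1**2 - x2**2 + 1.0
    return 1 if sigmoid(z) > 0.5 else 0

print(predict(0.0, 0.0))  # 1: inside the circle
print(predict(2.0, 2.0))  # 0: outside the circle
```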
Tradeoffs You Should Be Able to Explain
- More expressive models improve fit but can reduce interpretability and raise overfitting risk.
- Higher optimization speed can reduce training time but may increase instability if learning dynamics are not monitored.
- Feature-rich pipelines improve performance ceilings but increase maintenance and monitoring complexity.
First-time learner note: Read each model as a dataflow system: inputs become representations, representations become scores, and scores become decisions through a chosen loss and thresholding policy.
Production note: Track three things relentlessly in ML systems: data shape contracts, evaluation methodology, and the operational meaning of the model's errors. Most expensive failures come from one of those three.