A29: Logistic Regression (Part-1)>> Theory Slides/lecture!
Classification problem, standard logistic function, confusion matrix, true positive/negative, accuracy, specificity, precision, error/misclassification rate…!
This article is a part of “Data Science from Scratch — Can I to I Can”, A Lecture Notes Book Series. (click here to get your copy today!)
Hi guys,
It’s time to learn Logistic Regression. Let’s talk about our first and most common classification algorithm.
In this section, we will discuss:
- The classification problem and the use of Logistic Regression to solve it.
- How to interpret the results from Logistic Regression through the confusion matrix.
In a classification problem, rather than predicting a continuous or quantitative output value (e.g., today’s stock price, a house price, etc.), we are interested in a non-numerical value, a categorical or qualitative output (e.g., will the stock index increase or decrease). In this section, our focus will be to learn Logistic Regression as a method for classification. Today, the Logistic Regression model is one of the most widely used machine learning algorithms for binary classification problems.
Linear regression is not appropriate in the case of a qualitative response (a classification problem). Why not? Let’s try to understand with a simple and very common example:
We want to predict the probability of default based on the customer’s outstanding credit card balance.
What if we use linear regression (as shown in the plot below)?
Well, in the plot above, we get some negative values for the estimated probability of default for balances close to zero. On the other hand, what if the balance is very large? Just extrapolate your Linear Regression line and you would get an estimated probability of default much bigger than 1!
Recall your knowledge of probability: regardless of your feature(s) X, your probability of default must be between 0 and 1, right?
Such a situation (as shown in the plot above) appears any time we try to fit a straight line to a binary target (0/1 or No/Yes). Strictly speaking, unless the range of X is limited, we can always get probabilities less than 0 or greater than 1 for some values of X. The problem is that we want the probabilities, p(X), to be between 0 and 1, and this is where the logistic function comes into the picture in Logistic Regression.
In the next plot, we have Logistic Regression fitted to the same data. Now, for low balances the model predicts p(X) close to, but never below, 0, and for high balances it predicts p(X) close to, but never above, 1.
The logistic function is a sigmoid function that always produces an S-shaped curve, so regardless of the value of the feature(s) X, we get a reasonable prediction for our target class (0/1).
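To see why the output always stays between 0 and 1, here is a minimal sketch of the standard logistic (sigmoid) function in Python (a quick illustration, not the lecture’s own code):

```python
import numpy as np

def sigmoid(z):
    """Standard logistic (sigmoid) function: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# No matter how extreme z is, the output stays strictly between 0 and 1,
# tracing the familiar S-shaped curve.
z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
print(sigmoid(z))  # approx. [0.0000454, 0.2689, 0.5, 0.7311, 0.99995]
```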
So, the Logistic Regression model is better at capturing the range of probabilities than the Linear Regression model in a classification problem.
Once we train our Logistic Regression model on the training dataset, we test its performance on the test dataset, which is unseen by the model.
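To make the train/test idea concrete, here is a minimal, hypothetical sketch with scikit-learn on a toy balance-vs-default dataset (the data and variable names are made up for illustration, not the lecture’s actual data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy data: higher balance -> higher chance of default (a made-up rule).
rng = np.random.default_rng(42)
balance = rng.uniform(0, 2500, size=1000).reshape(-1, 1)       # credit card balance
p_default = 1 / (1 + np.exp(-(balance.ravel() - 1800) / 200))  # "true" default probability
default = rng.binomial(1, p_default)                           # 1 = default, 0 = no default

X_train, X_test, y_train, y_test = train_test_split(
    balance, default, test_size=0.3, random_state=0)

model = LogisticRegression()
model.fit(X_train, y_train)                  # learn from the training set only
y_prob = model.predict_proba(X_test)[:, 1]   # estimated p(X), always between 0 and 1
y_pred = model.predict(X_test)               # predicted class: 0/1 (No/Yes)
```

By default, the predicted class is simply the one with the higher estimated probability (a 0.5 cut-off).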
Now, the most important thing is to evaluate the performance of our model to see how good its predictions are!
We can use a Confusion Matrix to evaluate classification models.
Good to know: By analogy to Multiple Linear Regression, if we use multiple features in Logistic Regression, it is called Multiple Logistic Regression.
Confusion matrix: (Model Evaluation)
A confusion matrix (also known as an error matrix) is often used to describe the performance of a classification model on a set of test data for which the true values are known. Let’s try to learn by testing a number of persons for some disease, or maybe for defaults; you can think of any binary classification problem for the moment!
Let’s define the basic terminology:
• True Negatives (TN): Our model predicted No, and they don’t have the disease (correct).
• True Positives (TP): Our model predicted Yes, and they do have the disease (correct).
• False Positives (FP): Our model predicted Yes, but they don’t actually have the disease (wrong prediction — known as a “Type I error”).
• False Negatives (FN): Our model predicted No, but they actually do have the disease (wrong prediction — known as a “Type II error”).
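Putting the four counts into the usual 2×2 layout makes the metrics below easier to follow. Here is a tiny sketch using the example counts implied by the calculations that follow (TN = 40, TP = 50, FP = 3, FN = 7, for a test set of 100 persons):

```python
import numpy as np

# Rows: actual class (No, Yes); columns: predicted class (No, Yes).
#                 predicted No   predicted Yes
# actual No           TN = 40         FP = 3
# actual Yes          FN = 7          TP = 50
conf_matrix = np.array([[40, 3],
                        [7, 50]])
print(conf_matrix.sum())  # 100 persons in total
```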
Accuracy: Overall, how often is our model correct?
- Accuracy = correct predictions / total
- Accuracy = (TN + TP) / total = 90 / 100 = 0.90
Misclassification Rate / Error Rate: Overall, how often is our model wrong?
- Error Rate = wrong predictions / total
- Error Rate = (FP + FN) / total = 10 / 100 = 0.10
Specificity: When it’s actually No, how often does the model predict No?
- Specificity = TN / actual No = 40 / 43 = 0.93
Precision: When the model predicts Yes, how often is it correct?
- Precision = TP / predicted Yes = 50 / 53 = 0.94
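As a quick check, all four metrics can be computed directly from the example counts in a few lines of Python:

```python
# Example counts from the confusion matrix above.
TN, TP, FP, FN = 40, 50, 3, 7
total = TN + TP + FP + FN          # 100

accuracy    = (TN + TP) / total    # 90 / 100 = 0.90
error_rate  = (FP + FN) / total    # 10 / 100 = 0.10
specificity = TN / (TN + FP)       # 40 / 43  ≈ 0.93
precision   = TP / (TP + FP)       # 50 / 53  ≈ 0.94

print(accuracy, error_rate, round(specificity, 2), round(precision, 2))
# 0.9 0.1 0.93 0.94
```

In practice, scikit-learn’s confusion_matrix and classification_report can produce these counts and most of these metrics for you once you have the actual and predicted labels.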
The confusion matrix is very important in model evaluation for classification algorithms. I hope it will be much easier to memorize the confusion matrix and the types of errors with this comic.
I would like to keep this lecture very simple and finish it here.
In the next lecture:
- We will talk about more important concepts, including the logit (link) function, probabilities vs. odds, interpretation of logistic regression coefficients, baseline accuracy, the accuracy paradox, F-score, power analysis … and much more.
- We will use code to get the plots, work with the data, and try to understand logistic regression in further detail. In the meantime, this bird’s-eye overview will be helpful!
💐💐Click here to FOLLOW ME for new content💐💐
🌹Keep practicing to brush up & add new skills🌹
✅🌹💐💐💐🌹✅ Please clap and share >> you can help us reach someone who is struggling to learn these concepts.✅🌹💐💐💐🌹✅
Good luck!
See you in the next lecture on “A30: Logistic Regression (Part-2)>>Behind the Scene”.
Note: This complete course, including video lectures and Jupyter notebooks, is available via the following links:
Dr. Junaid Qazi is a subject matter specialist, data science & machine learning consultant, and a team builder. He is a professional development coach, mentor, author, technical writer, and invited speaker. Dr. Qazi can be reached for consulting projects, technical writing and/or professional development trainings via LinkedIn.