Unlike some other machine learning algorithms (think kNN), Logistic Regression provides probability predictions and not only classification labels.
Depending on your output needs this can be very useful, especially if you want to integrate this implementation with another system that works on probability measures.
A good example: you might be after a "spam | no spam" classifier but want the decision to be adjustable based on a probability threshold (similar to Google reCAPTCHA v3). In that case, having probabilities rather than only labels is what makes the project possible.
Bank lending can be another field where you want a probability for each client rather than a strict binary answer.
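As a quick sketch of what this looks like in practice, here is `predict_proba` on a small synthetic dataset (the data and the 200/4 shape are illustrative, not from any real spam or loan dataset):

```python
# Minimal sketch: LogisticRegression gives you per-class probabilities,
# not just hard labels. Dataset here is synthetic, purely for illustration.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = LogisticRegression().fit(X, y)

labels = clf.predict(X[:3])       # hard "spam | no spam"-style labels
probs = clf.predict_proba(X[:3])  # per-class probabilities, each row sums to 1
print(labels)
print(probs)
```

With the probabilities in hand, you can pick your own cutoff (0.5, 0.9, whatever the downstream system needs) instead of being stuck with the default decision.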
One of the great advantages of Logistic Regression is that when you have a linear problem and not a whole lot of data, it's still able to produce pretty useful predictions. This is a pro that comes from Logistic Regression's mathematical foundations and isn't something most other machine learning models can offer.
Logistic Regression is not a resource-hungry model (unlike many others, think NNs, SVMs, kNN), and this makes it suitable for simple applications.
Logistic Regression can struggle to find real use cases in real-world problems because of how selective its assumptions make it.
However, it's still respected and good to know. The leap from Linear Regression models to Logistic Regression was incredible when it was first invented. Today it's easy to understand, especially if you have a technical background, and it opens your mind to how smart the idea was (and is). But I bet you it wasn't that easy to come up with when it was nonexistent.
So it's not really a practical advantage, but at least for its place in history Logistic Regression is like a museum piece you don't want to skip.
This doesn't mean it has absolutely no use case in the industry; you'll just need very specific cases that it applies to.
Logistic Regression won't overfit easily since it's a linear model. Especially with the C regularization parameter in scikit-learn you can easily take control of any overfitting anxiety you might have.
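A short sketch of what C does: in scikit-learn, C is the inverse of regularization strength, so a smaller C means stronger regularization. The dataset below is synthetic, just to make the effect visible:

```python
# Smaller C = stronger regularization in scikit-learn's LogisticRegression.
# Stronger regularization pulls the coefficients toward zero.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

strong = LogisticRegression(C=0.01).fit(X, y)   # heavy shrinkage
weak = LogisticRegression(C=100.0).fit(X, y)    # near-unregularized

print(np.abs(strong.coef_).sum())  # small coefficient magnitudes
print(np.abs(weak.coef_).sum())    # larger coefficient magnitudes
```

If overfitting worries you, try a few C values with cross validation rather than guessing one.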
Since Logistic Regression comes with a fast, resource-friendly algorithm, it scales pretty nicely. While many algorithms struggle with large datasets (such as SVMs, kNNs and sometimes tree-based models), Logistic Regression will scale very nicely and let you harvest your millions of rows without your hair losing its original color. Oh wait, unless its original color is white! Anyway, I think you get the point.
Inside the borders of linearity, Logistic Regression actually has some nice fitting flexibility. By using the regularization parameters you can apply different regularization techniques to Logistic Regression to reduce the error in the model or fine-tune the fit.
Lasso (L1), Ridge (L2) or Elastic Net penalties can be applied in this sense. Regularization makes Logistic Regression behave more like Naive Bayes in the sense that it becomes a more generalist model and tends to avoid noise and outliers.
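Here is a sketch of the three penalty options in scikit-learn. The `saga` solver supports all of them (elasticnet also needs `l1_ratio`); the dataset is synthetic:

```python
# The penalty parameter selects the regularization technique.
# 'saga' is a solver that supports l1, l2 and elasticnet penalties.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=20, random_state=0)

ridge = LogisticRegression(penalty="l2", solver="saga", max_iter=5000).fit(X, y)
lasso = LogisticRegression(penalty="l1", solver="saga", max_iter=5000).fit(X, y)
enet = LogisticRegression(penalty="elasticnet", solver="saga",
                          l1_ratio=0.5, max_iter=5000).fit(X, y)

# L1 tends to zero out some coefficients entirely (sparsity), L2 only shrinks:
print((lasso.coef_ == 0).sum(), (ridge.coef_ == 0).sum())
```

The L1 penalty's sparsity is handy when you suspect many of your features are noise.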
Logistic Regression is still prone to overfitting, although less so than some other models. To counter this tendency you can introduce more training data and regularization.
Just as a lack of regularization can be a con, regularization itself can be one too. The high necessity of regularization in Logistic Regression means a few more parameters to optimize, advanced topics to dive into and cross validation to carry out. (Life of a modern human! Who can relate?)
Logistic Regression is strictly a classification method, and it has lots of competition (SVMs, Naive Bayes, Random Forests, kNN, etc.).
Logistic Regression inherently runs on a linear model. This means even more restrictions when it comes to implementing it.
If you have a non-linear problem at hand you'll have to look for another model, but no worries, there are plenty (think Naive Bayes, SVM, kNN).
Data preparation can be tedious with Logistic Regression, as both scaling and normalization are important requirements of the model.
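One way to keep this manageable is to bundle the scaling step with the model in a Pipeline, so the same preprocessing is applied at both fit and predict time. A minimal sketch on synthetic data:

```python
# Bundling StandardScaler with LogisticRegression in a Pipeline keeps
# the scaling step consistent between training and prediction.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, y)
acc = model.score(X, y)  # training accuracy, just to confirm it fit
print(acc)
```

The Pipeline also prevents a classic mistake: fitting the scaler on the test data.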
Logistic Regression is not immune to missing data, unlike some other machine learning models such as decision trees and random forests, which are tree-based.
This usually means extra work processing the missing values in your data.
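A common way to handle this extra work is an imputation step before the model. Here is a sketch using scikit-learn's SimpleImputer on a tiny made-up array with NaN gaps:

```python
# LogisticRegression raises an error on NaN inputs, so missing values
# must be filled first; SimpleImputer replaces them with the column mean.
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

X = np.array([[1.0, 2.0], [np.nan, 3.0], [4.0, np.nan],
              [5.0, 6.0], [2.0, 1.0], [np.nan, 5.0]])
y = np.array([0, 0, 1, 1, 0, 1])

model = make_pipeline(SimpleImputer(strategy="mean"), LogisticRegression())
model.fit(X, y)  # would raise a ValueError without the imputer
preds = model.predict(X)
print(preds)
```

Mean imputation is just the simplest choice; median or more elaborate strategies may suit your data better.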
Sometimes plain results just won't cut it; you'll want to hear the reasons behind them. Logistic Regression's probability estimates are very welcome in those cases.
Logistic Regression is not as computationally costly as most other models.
Logistic Regression's scalability means it can handle very large datasets with ease.
It's just not that common to come across linear decision boundary problems that require a Machine Learning implementation, especially if we also look for feature independence.
Logistic Regression doesn't require tons of data to get smart. It can produce good results with small datasets when other models can't.
Normalization and scaling are realities of Logistic Regression. On top of that, you will have to take care of missing values in the data.