Logistic Regression Optimization
Logistic Regression Optimization Parameters Explained
These are the most commonly adjusted parameters with Logistic Regression. Let’s take a deeper look at what they are used for and how to change their values:
penalty: (default: "l2") Defines the penalization norm. Each solver supports only certain penalties, so penalty and solver should be chosen together.
l1: penalty supported by liblinear and saga solvers.
l2: penalty supported by lbfgs, liblinear, newton-cg, sag and saga solvers.
elasticnet: penalty only supported by the saga solver. It also requires the l1_ratio parameter to be set.
none: Penalty regularization won't be applied. Doesn't work with the liblinear solver. Recent scikit-learn releases expect the Python value None here rather than the string "none".
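The penalty and solver pairings above can be tried out with a small sketch. The synthetic dataset, the C value and the l1_ratio value below are illustrative assumptions, not recommendations; the point is only that l1-style penalties tend to push some coefficients to exactly zero while l2 does not.

```python
import warnings

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Hide convergence/deprecation notices so the output of this sketch stays readable.
warnings.filterwarnings("ignore")

# Synthetic data (assumption): 3 informative features and 17 pure-noise features.
X, y = make_classification(n_samples=300, n_features=20, n_informative=3,
                           n_redundant=0, random_state=0)
X = StandardScaler().fit_transform(X)

# Each penalty is paired with a solver that supports it.
models = {
    "l1 + liblinear": LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
    "l2 + lbfgs": LogisticRegression(penalty="l2", solver="lbfgs", C=0.1),
    "elasticnet + saga": LogisticRegression(penalty="elasticnet", solver="saga",
                                            l1_ratio=0.5, C=0.1, max_iter=2000),
}

zero_counts = {}
for name, model in models.items():
    model.fit(X, y)
    # Count coefficients that the penalty drove to exactly zero.
    zero_counts[name] = int(np.sum(model.coef_ == 0))
    print(f"{name}: {zero_counts[name]} of {model.coef_.size} coefficients are exactly zero")
```

On data like this, the l1 model would be expected to zero out most of the noise features, which is why l1 is often used as a rough form of feature selection.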
solver: (default: "lbfgs") Chooses the algorithm used for optimization. The default solver usually works well, and there are suggestions below for specific situations such as classification problems with large or very large datasets.
For particular cases it's always a good idea to compare different solvers on your training and test data. This also helps in understanding the finer differences between solvers, which is a very interesting topic.
lbfgs: Stands for limited-memory BFGS. Instead of computing the Hessian, this solver builds an approximation to it from recent gradients, which makes it computationally cheaper. To keep its memory usage bounded, unlike regular BFGS, it stores only the most recent gradient updates and discards earlier ones.
liblinear: An efficient solver for small datasets. It handles multiclass problems only through the ovr (one-versus-rest) scheme and does not support the multinomial loss, unlike the other solvers here. It works with the l1 and l2 penalties but not with elasticnet or none.
newton-cg: A Newton-type solver that relies on second-derivative (Hessian) information, which can be computationally expensive in high dimensions.
sag: Stands for Stochastic Average Gradient. An efficient solver for large datasets; it converges fastest when the features are on a similar scale.
saga: A variant of sag that also supports l1 and elasticnet regularization. It is quite time-efficient and usually the go-to solver for very large datasets.
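One way to compare solvers as suggested above is to fit each one on the same split and look at training accuracy, test accuracy and fit time. This is a minimal sketch on a synthetic dataset (an assumption); timings on such a small problem are only indicative and will differ on real data.

```python
import time
import warnings

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hide convergence/deprecation notices so the comparison table stays readable.
warnings.filterwarnings("ignore")

# Synthetic binary classification data (assumption, for illustration only).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

results = {}
for solver in ["lbfgs", "liblinear", "newton-cg", "sag", "saga"]:
    # Scaling helps sag and saga converge; all five solvers support the default l2 penalty.
    clf = make_pipeline(
        StandardScaler(),
        LogisticRegression(solver=solver, max_iter=1000, random_state=0),
    )
    start = time.perf_counter()
    clf.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    results[solver] = {
        "train_acc": clf.score(X_train, y_train),
        "test_acc": clf.score(X_test, y_test),
        "seconds": elapsed,
    }

for solver, r in results.items():
    print(f"{solver:10s} train={r['train_acc']:.3f} "
          f"test={r['test_acc']:.3f} time={r['seconds']:.4f}s")
```

Since every solver minimizes the same objective here, the accuracies should be close to each other; the differences usually show up in fit time and in how each solver scales with dataset size.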
dual: (default: False) Chooses between the dual and primal formulation. The dual formulation is only implemented for the l2 penalty with the liblinear solver. Prefer dual=False when n_samples > n_features.
tol: (default: 0.0001) Tolerance for the stopping criterion. The solver stops iterating once its convergence measure falls below this value.
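The effect of tol can be seen by checking how many iterations the solver reports for different tolerances. The sketch below uses the lbfgs solver on a synthetic dataset; the dataset and the three tolerance values are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data (assumption, for illustration only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

iterations = {}
for tol in [1e-1, 1e-4, 1e-8]:
    model = LogisticRegression(solver="lbfgs", tol=tol, max_iter=1000).fit(X, y)
    # n_iter_ holds the number of iterations actually run by the solver.
    iterations[tol] = int(model.n_iter_[0])
    print(f"tol={tol:g}: stopped after {iterations[tol]} iterations")
```

A looser tolerance stops earlier with a rougher solution, while a tighter one runs longer; if the solver hits max_iter first, scikit-learn emits a convergence warning instead.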
C: (default: 1.0) This parameter is the inverse of the regularization strength and takes a positive float value. The smaller C is, the stronger the regularization will be.
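The inverse relationship between C and regularization strength can be checked by looking at the size of the fitted coefficients. This is a small sketch with a synthetic dataset and three arbitrary C values (both assumptions).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (assumption, for illustration only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X = StandardScaler().fit_transform(X)

coef_norms = {}
for C in [0.001, 0.1, 10.0]:
    model = LogisticRegression(C=C, max_iter=5000).fit(X, y)
    # Euclidean norm of the coefficient vector: smaller means more shrinkage.
    coef_norms[C] = float(np.linalg.norm(model.coef_))
    print(f"C={C:g}: ||coef|| = {coef_norms[C]:.4f}")
```

With the default l2 penalty, the coefficient norm should grow as C increases, because a larger C lets the model fit the training data more closely.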
fit_intercept: (default: True) Specifies whether a constant (also called bias or intercept) should be added to the decision function.
random_state: (default: None) Controls the random seed. It is used to shuffle the data when the solver is sag, saga or liblinear.
None: the global random state from numpy.random is used.
int: the integer is used as the seed of the random number generator.
RandomState: the given RandomState instance is used as the random number generator.
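A quick way to see what random_state does is to fit a shuffling solver twice with the same seed and compare the coefficients. The sketch below uses saga with a deliberately small max_iter so the fit stops before convergence; the dataset, the seed values and the helper name fit_coef are assumptions made for illustration.

```python
import warnings

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# The fits below stop early on purpose, so silence the convergence warnings.
warnings.filterwarnings("ignore")

# Synthetic binary classification data (assumption, for illustration only).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X = StandardScaler().fit_transform(X)


def fit_coef(seed):
    """Hypothetical helper: fit saga for a few epochs and return the coefficients."""
    model = LogisticRegression(solver="saga", max_iter=5, random_state=seed)
    return model.fit(X, y).coef_.copy()


# Same seed twice: the data is shuffled identically, so the results should match.
same_seed_match = np.array_equal(fit_coef(42), fit_coef(42))
# Different seeds: the shuffling order differs, so small differences are likely.
max_diff = float(np.max(np.abs(fit_coef(1) - fit_coef(2))))

print("same seed gives identical coefficients:", same_seed_match)
print("largest coefficient difference between seeds 1 and 2:", max_diff)
```

Fixing random_state to an integer is the usual way to make such runs reproducible.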
from sklearn.linear_model import LogisticRegression
LRM = LogisticRegression(solver="saga", penalty="elasticnet", l1_ratio=0.5)  # elasticnet requires l1_ratio; 0.5 is an illustrative value
LRM = LogisticRegression(tol = 0.0009)
LRM = LogisticRegression(fit_intercept = True)
LRM = LogisticRegression(verbose = 2)
LRM = LogisticRegression(warm_start = True)
More Logistic Regression Optimization Parameters for fine tuning
Beyond the parameters above, further parameters can be used for fine tuning, for example to keep overfitting in check.
BFGS Solver: stands for Broyden–Fletcher–Goldfarb–Shanno
L-BFGS Solver: stands for Limited-memory Broyden–Fletcher–Goldfarb–Shanno
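To make the limited-memory idea more concrete, here is a rough sketch that minimizes an l2-regularized logistic loss directly with SciPy's L-BFGS-B routine. It is not how scikit-learn implements its lbfgs solver internally; the synthetic data, the function name loss_and_grad and the exact form of the objective (no intercept, C multiplying the data term) are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

# Synthetic data (assumption): 200 samples, 5 features, labels in {0, 1}.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = np.array([1.5, -2.0, 0.0, 0.5, 0.0])
y = (X @ true_w + rng.normal(scale=0.5, size=200) > 0).astype(float)


def loss_and_grad(w, X, y, C=1.0):
    """Hypothetical helper: l2-regularized logistic loss and its gradient."""
    z = X @ w
    # log(1 + exp(z)) - y * z is the negative log-likelihood, computed stably.
    loss = C * np.sum(np.logaddexp(0.0, z) - y * z) + 0.5 * (w @ w)
    # Gradient: C * X^T (sigmoid(z) - y) plus the l2 term.
    grad = C * (X.T @ (expit(z) - y)) + w
    return loss, grad


w0 = np.zeros(X.shape[1])
initial_loss, _ = loss_and_grad(w0, X, y)

# maxcor is the number of past correction pairs kept in memory:
# this is the "limited memory" in L-BFGS.
result = minimize(loss_and_grad, w0, args=(X, y), jac=True,
                  method="L-BFGS-B", options={"maxcor": 10})

print("initial loss:", initial_loss)
print("final loss:  ", result.fun)
print("iterations:  ", result.nit)
print("weights:     ", np.round(result.x, 3))
```

Only gradients are supplied to the optimizer; the curvature information is approximated from the last few gradient differences, which is what keeps the memory cost low compared with storing a full Hessian approximation as in regular BFGS.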
1- Broyden, C. G. (1970), "The convergence of a class of double-rank minimization algorithms"
2- Fletcher, R. (1970), "A New Approach to Variable Metric Algorithms"
3- Goldfarb, D. (1970), "A Family of Variable Metric Updates Derived by Variational Means"
4- Shanno, D. F. (1970), "Conditioning of quasi-Newton methods for function minimization"
5- Fletcher, R. (1987), "Practical methods of optimization (2nd edition)"