Naive Bayes Optimization

These are the most commonly adjusted parameters across the different Naive Bayes algorithms. Let’s take a closer look at what each one does and how to change its value:

Gaussian Naive Bayes Parameters:

priors
var_smoothing

Parameters for Multinomial Naive Bayes, Complement Naive Bayes, Bernoulli Naive Bayes, Categorical Naive Bayes:

alpha
fit_prior
class_prior

priors: (default: None) Prior probabilities of the classes. When priors are provided (as an array), they are used as given and won’t be adjusted based on the dataset.
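As a minimal sketch of how fixed priors behave, the snippet below fits GaussianNB twice on a small made-up dataset and compares the fitted class_prior_ attribute. The data and the 0.5/0.5 values are illustrative assumptions, not taken from a real problem.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Toy data (illustrative): 6 samples of class 0, 2 samples of class 1
X = np.array([[1.0], [1.2], [0.9], [1.1], [0.8], [1.3], [3.0], [3.2]])
y = np.array([0, 0, 0, 0, 0, 0, 1, 1])

# Default: priors are estimated from the class frequencies (6/8 and 2/8)
gnb_learned = GaussianNB().fit(X, y)

# Fixed priors: the supplied array is kept as-is and not adjusted to the data
gnb_fixed = GaussianNB(priors=[0.5, 0.5]).fit(X, y)

print(gnb_learned.class_prior_)
print(gnb_fixed.class_prior_)
```

The supplied array needs one entry per class and should sum to 1.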

var_smoothing: (default: 1e-9) Concerning variance smoothing, this float is the portion of the largest variance among all features that is added to each feature's variance for calculation stability.
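To make the variance smoothing concrete, here is a small sketch on synthetic data (the data and variable names are illustrative). It relies on the fitted epsilon_ attribute, which in scikit-learn's GaussianNB holds the absolute value added to the variances; treat the exact relationship shown as my reading of the implementation rather than a guarantee.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
# Two features on very different scales (illustrative data)
X = np.column_stack([rng.normal(0, 1, 100), rng.normal(0, 50, 100)])
y = np.array([0, 1] * 50)

var_smoothing = 1e-9
gnb = GaussianNB(var_smoothing=var_smoothing).fit(X, y)

# The additive term is var_smoothing times the largest per-feature variance
expected_epsilon = var_smoothing * X.var(axis=0).max()

print(gnb.epsilon_, expected_epsilon)
```

Because the term scales with the largest variance, raising var_smoothing has more effect on features whose variances are small compared with that maximum.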

alpha: (default: 1.0) The additive (Laplace/Lidstone) smoothing parameter used by the Multinomial, Complement, Bernoulli and Categorical Naive Bayes algorithms.

0: No smoothing will be applied

float: Smoothing of that strength will be applied. A value of 1.0 corresponds to Laplace smoothing; values between 0 and 1 correspond to Lidstone smoothing.
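The effect of alpha can be checked by hand. The sketch below fits MultinomialNB on a tiny made-up count matrix and reproduces the smoothed feature probabilities with the usual additive-smoothing formula, (count + alpha) / (class total + alpha * number of features). The data and the alpha value are illustrative.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Toy word-count matrix (illustrative): 4 documents, 3 vocabulary terms
X = np.array([[2, 1, 0],
              [3, 0, 0],
              [0, 1, 4],
              [0, 2, 3]])
y = np.array([0, 0, 1, 1])

alpha = 0.6
mnb = MultinomialNB(alpha=alpha).fit(X, y)

# Manual additive smoothing per class
counts = np.array([X[y == c].sum(axis=0) for c in (0, 1)])
manual = (counts + alpha) / (counts.sum(axis=1, keepdims=True) + alpha * X.shape[1])

print(np.exp(mnb.feature_log_prob_))
print(manual)
```

Note that the third term never occurs in class 0, yet its smoothed probability stays above zero, which is the point of the smoothing.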

fit_prior: (default: True) Whether class prior probabilities are learned from the data.

True: Prior probabilities for classes will be learned.

False: A uniform prior will be used for class prior probabilities.
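A short sketch of the difference, using a deliberately imbalanced toy dataset (illustrative values). The fitted class_log_prior_ attribute stores the log priors, so it is exponentiated here for readability.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Imbalanced toy counts (illustrative): 3 samples of class 0, 1 of class 1
X = np.array([[2, 1], [3, 0], [1, 1], [0, 4]])
y = np.array([0, 0, 0, 1])

learned = MultinomialNB(fit_prior=True).fit(X, y)
uniform = MultinomialNB(fit_prior=False).fit(X, y)

# Exponentiate the log priors to read them as probabilities
print(np.exp(learned.class_log_prior_))  # follows the class frequencies
print(np.exp(uniform.class_log_prior_))  # equal weight for every class
```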

class_prior: (default: None) Refers to class prior probabilities.

None: Priors will be determined by fit_prior (by default, learned from the dataset).

Array: Priors will have pre-defined class probabilities and won’t be adjusted based on the data.
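As a rough illustration of a pre-defined class_prior, the sketch below compares a default BernoulliNB with one that is told class 0 is far more likely. The data, the 0.9/0.1 prior and the test sample are all made up for the example.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Balanced toy data (illustrative): binary features, two classes
X = np.array([[1, 0], [1, 1], [0, 1], [0, 0]])
y = np.array([0, 0, 1, 1])

default_model = BernoulliNB().fit(X, y)
skewed_model = BernoulliNB(class_prior=[0.9, 0.1]).fit(X, y)

sample = np.array([[1, 1]])

# Probability of class 0 for the same sample under each prior
p_default = default_model.predict_proba(sample)[0, 0]
p_skewed = skewed_model.predict_proba(sample)[0, 0]

print(p_default, p_skewed)
```

Only the prior changes between the two models; the per-feature likelihoods are estimated from the same data, so the shift in predicted probability comes from class_prior alone.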

Examples:

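# Gaussian Naive Bayes with a larger variance-smoothing term
from sklearn.naive_bayes import GaussianNB
GNB = GaussianNB(var_smoothing=2e-9)

# Multinomial Naive Bayes with Lidstone smoothing (alpha < 1)
from sklearn.naive_bayes import MultinomialNB
MNB = MultinomialNB(alpha=0.6)

# Bernoulli Naive Bayes with a uniform class prior
from sklearn.naive_bayes import BernoulliNB
BNB = BernoulliNB(fit_prior=False)

# Complement Naive Bayes with the second weight normalization enabled
from sklearn.naive_bayes import ComplementNB
CNB = ComplementNB(norm=True)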
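Rather than picking values by hand, the same parameters can be searched with cross-validation. The following is only a sketch: the synthetic data, the candidate values in param_grid and the choice of 3 folds are assumptions made for illustration, not recommendations.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB

rng = np.random.default_rng(42)
# Synthetic count data (illustrative): two classes with different term rates
X = np.vstack([rng.poisson([3, 1, 1], size=(30, 3)),
               rng.poisson([1, 1, 3], size=(30, 3))])
y = np.array([0] * 30 + [1] * 30)

# Candidate values to try (hypothetical grid)
param_grid = {"alpha": [0.1, 0.6, 1.0], "fit_prior": [True, False]}

search = GridSearchCV(MultinomialNB(), param_grid, cv=3)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

On real data the grid and the scoring metric would need to reflect the problem at hand, for example a wider alpha range for sparse text features.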

More parameters

More Naive Bayes Parameters for fine tuning

Beyond the parameters above, these algorithm-specific parameters can be used for further fine tuning:

binarize
norm

binarize

(default: 0.0)

This parameter only applies to the Bernoulli Naive Bayes algorithm.

float: Sample features will be binarized based on this threshold value: values greater than the threshold are mapped to 1, all others to 0.

None: Sample features will be assumed to already be binary values (mapped to booleans).
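A small sketch of the two settings, on made-up continuous features with an illustrative threshold of 0.5. Thresholding inside the model and thresholding by hand before fitting with binarize=None should yield the same fitted model.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Continuous toy features (illustrative)
X = np.array([[0.2, 0.9], [0.7, 0.1], [0.6, 0.8], [0.1, 0.3]])
y = np.array([0, 1, 1, 0])

# Threshold applied by the model: values strictly greater than 0.5 become 1
bnb_threshold = BernoulliNB(binarize=0.5).fit(X, y)

# Same thresholding done by hand; binarize=None tells the model the input is already binary
X_binary = (X > 0.5).astype(float)
bnb_prebinarized = BernoulliNB(binarize=None).fit(X_binary, y)

print(bnb_threshold.feature_count_)
print(bnb_prebinarized.feature_count_)
```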

norm

(default: False)

This parameter only applies to the Complement Naive Bayes algorithm. It controls whether a second normalization of the weights is performed.

False: The second normalization won't be performed (consistent with the Weka and Mahout implementations).

True: The second normalization will be performed.
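The sketch below shows what the second normalization appears to do in practice. It is based on my reading of scikit-learn's ComplementNB, where the fitted feature_log_prob_ attribute holds the per-class weights; treat the exact relationship as an assumption that may vary between versions. The data is made up.

```python
import numpy as np
from sklearn.naive_bayes import ComplementNB

# Toy count matrix (illustrative): 4 documents, 3 terms
X = np.array([[2, 1, 0], [3, 0, 1], [0, 1, 4], [1, 2, 3]])
y = np.array([0, 0, 1, 1])

cnb_plain = ComplementNB(norm=False).fit(X, y)
cnb_norm = ComplementNB(norm=True).fit(X, y)

# Weights without the second normalization
w = cnb_plain.feature_log_prob_

# With norm=True, each class's weights are rescaled by their sum
rescaled = w / w.sum(axis=1, keepdims=True)

print(cnb_norm.feature_log_prob_)
print(rescaled)
```

With norm=True the weights of each class sum to 1, so no single class dominates just because its weight vector is larger in magnitude.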

Official Scikit Learn Documentation: sklearn.naive_bayes.GaussianNB