
Multinomial NB example on MNIST Dataset

· 3 min read
Shaurya Singhal

Source: View original notebook on GitHub



from sklearn.datasets import load_digits
mnist = load_digits()  # scikit-learn's 8x8 digits dataset (a small MNIST-like set)
X = mnist.data
Y = mnist.target
mnist.target_names  # the ten possible digit labels

Output:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
X.shape  # 1797 samples, 64 features each (one 8x8 image flattened per row)

Output:

(1797, 64)
Y.shape

Output:

(1797,)
import matplotlib.pyplot as plt
plt.imshow(X[1].reshape(8,8),cmap='gray')
plt.show()

Output: (grayscale plot of the 8x8 image for sample 1)

Y[1]

Output:

1
X[0]  # pixel intensities range from 0 to 16 -> shades of grey

Output:

array([ 0.,  0.,  5., 13.,  9.,  1.,  0.,  0.,  0.,  0., 13., 15., 10.,
15., 5., 0., 0., 3., 15., 2., 0., 11., 8., 0., 0., 4.,
12., 0., 0., 8., 8., 0., 0., 5., 8., 0., 0., 9., 8.,
0., 0., 4., 11., 0., 1., 12., 7., 0., 0., 2., 14., 5.,
10., 12., 0., 0., 0., 0., 6., 13., 10., 0., 0., 0.])
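As a quick sanity check (a supplementary sketch, not part of the original notebook), the actual pixel intensity range can be inspected directly:

```python
from sklearn.datasets import load_digits
import numpy as np

# Load the 8x8 digits dataset and inspect the raw pixel range
X = load_digits().data
print(X.min(), X.max())   # intensities are small non-negative integers
print(np.unique(X).size)  # number of distinct pixel values
```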
# train test split

from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X,Y,test_size=0.25, random_state = 101)
X_train.shape, X_test.shape

Output:

((1347, 64), (450, 64))

Applying Multinomial Naive Bayes (each pixel feature takes a small set of discrete, count-like values)

from sklearn.naive_bayes import MultinomialNB
mnb = MultinomialNB()
mnb

Output:

MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)
mnb.fit(X_train,Y_train)

Output:

MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)
mnb.classes_

Output:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
mnb.predict_proba(X_test[0].reshape((1,-1)))

Output:

array([[1.15456271e-130, 6.13092874e-060, 3.24884657e-092,
3.31212248e-075, 5.19568172e-064, 1.72442368e-056,
1.20017925e-144, 1.00000000e+000, 1.40557595e-055,
1.73407453e-055]])
Y_test[0]

Output:

7
mnb.predict(X_test[0].reshape((1,-1)))

Output:

array([7])
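The predicted label is simply the class with the highest posterior probability. A small sketch (refitting the same model with the same split) confirms that predict agrees with the argmax of predict_proba:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

digits = load_digits()
X_train, X_test, Y_train, Y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=101)

mnb = MultinomialNB().fit(X_train, Y_train)

# predict() returns the class whose posterior probability is largest
probs = mnb.predict_proba(X_test[:10])
preds = mnb.predict(X_test[:10])
print(np.all(mnb.classes_[probs.argmax(axis=1)] == preds))  # True
```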

Accuracy

Y_pred = mnb.predict(X_test)
mnb.score(X_test,Y_test)

Output:

0.9155555555555556
import numpy as np
np.sum(Y_pred == Y_test) / X_test.shape[0]

Output:

0.9155555555555556
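Beyond the overall accuracy, a confusion matrix shows which digits get confused with which. This is a supplementary sketch (not part of the original notebook) using the same split:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import confusion_matrix

digits = load_digits()
X_train, X_test, Y_train, Y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=101)

mnb = MultinomialNB().fit(X_train, Y_train)
cm = confusion_matrix(Y_test, mnb.predict(X_test))
print(cm.shape)  # (10, 10); the diagonal holds correctly classified counts
```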

Applying Bernoulli Naive Bayes (converting the discrete pixel values to binary 0/1 values with a binarizer)

from sklearn.naive_bayes import BernoulliNB
bnb = BernoulliNB(binarize=7)  # values <= 7 become 0, values > 7 become 1
bnb.fit(X_train,Y_train)

Output:

BernoulliNB(alpha=1.0, binarize=7, class_prior=None, fit_prior=True)
bnb.classes_

Output:

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
bnb.predict_proba(X_test[0].reshape((1,-1)))

Output:

array([[1.85725429e-16, 3.84837258e-10, 2.38737848e-12, 5.19525179e-09,
2.34463989e-10, 1.10719034e-10, 1.39067356e-19, 9.99999672e-01,
4.15975813e-08, 2.80422953e-07]])
Y_test[0]

Output:

7
bnb.predict(X_test[0].reshape((1,-1)))

Output:

array([7])
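Under the hood, binarize=7 thresholds each pixel (values greater than 7 map to 1, the rest to 0). A hedged sketch checks that thresholding manually and passing binarize=None gives the same predictions:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB

digits = load_digits()
X_train, X_test, Y_train, Y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=101)

# Let BernoulliNB threshold internally at 7 ...
bnb_auto = BernoulliNB(binarize=7).fit(X_train, Y_train)

# ... versus thresholding manually (pixel > 7 -> 1, else 0)
bnb_manual = BernoulliNB(binarize=None).fit(X_train > 7, Y_train)

same = np.array_equal(bnb_auto.predict(X_test), bnb_manual.predict(X_test > 7))
print(same)  # True
```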

Accuracy (slightly lower than Multinomial)

Y_pred = bnb.predict(X_test)
bnb.score(X_test,Y_test)

Output:

0.8888888888888888
import numpy as np
np.sum(Y_pred == Y_test) / X_test.shape[0]

Output:

0.8888888888888888
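The choice of binarization threshold affects the result. As a final illustrative sketch (thresholds chosen arbitrarily, not from the original notebook), a small sweep shows how BernoulliNB's test accuracy varies with binarize:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB

digits = load_digits()
X_train, X_test, Y_train, Y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=101)

# Try a few binarization thresholds and compare test accuracy
scores = {}
for t in (0, 3, 5, 7, 9, 11):
    scores[t] = BernoulliNB(binarize=t).fit(X_train, Y_train).score(X_test, Y_test)
    print(f"binarize={t:2d}  accuracy={scores[t]:.4f}")
```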