Hello everybody,
Michael here, and I hope you all had a wonderful holiday season. I've got lots of exciting content planned for 2023, including something special for the blog's 5th anniversary (yup, this blog turns 5 on June 13), and I hope you'll follow along on this programming journey.
To start the year, I thought I'd pick up where I left off in 2022. If you recall, the last post I wrote in 2022 involved creating a basic neural network in Python using the famous MNIST dataset: Python Lesson 38: Building Your First Neural Network (AI pt. 2). You'll also likely recall that the neural network we built in that post had an accuracy of less than 20%. In this post, we'll explore a simple way to improve that accuracy. Let's get coding!
A little refresher on our previous project
In case you’d like to see it again, here’s our code for the neural network project we made in the previous post:
import tensorflow as tf
from tensorflow.keras.datasets import mnist

# Load the MNIST training and test sets
(trainX, trainY), (testX, testY) = mnist.load_data()
trainX.shape
testX.shape
trainY.shape
testY.shape
import matplotlib.pyplot as plt
imageNum = 1500
plt.imshow(trainX[imageNum], cmap='magma')
imageNum = 3332
plt.imshow(testX[imageNum], cmap='magma')
firstNeuralNetwork = tf.keras.models.Sequential([
tf.keras.layers.Flatten(input_shape=(28,28)),
tf.keras.layers.Dense(150, activation='relu'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10)
])
firstNeuralNetwork.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
firstNeuralNetwork.fit(x=trainX,y=trainY, epochs=25)
firstNeuralNetwork.evaluate(testX, testY)
To recap: in this code, we built a basic neural network in Python to classify the handwritten digits in the MNIST dataset, and, as I mentioned earlier, the model wasn't very accurate. In fact, we never achieved accuracy higher than 20% in any of our runs. Let's explore a simple way to change that.
One simple way to improve the neural network’s accuracy
Pay attention to this line of code. It creates the second Dense layer in our neural network, the output layer, which has ten neurons because there are ten possible digits (0 through 9):
tf.keras.layers.Dense(10)
Without an activation, this layer outputs raw, unnormalized scores (known as logits), but the sparse_categorical_crossentropy loss we compiled with expects probabilities by default, and that mismatch is a big part of why our model trained so poorly. The fix: similar to what we did for the first Dense layer, add an activation parameter when creating this Dense layer (after the number 10), and this time set its value to 'softmax', like so:
tf.keras.layers.Dense(10, activation='softmax')
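As an aside, Keras offers an equivalent route: keep Dense(10) producing raw logits and tell the loss function to expect logits instead of probabilities. Here's a sketch of how that would look with our model (this variant isn't from the original post, just an alternative worth knowing about):

```python
import tensorflow as tf

# Same architecture as before, but Dense(10) stays activation-free...
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(150, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)  # raw scores (logits), no softmax here
])

# ...and the loss is told to apply softmax internally via from_logits=True
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
```

Either way, the loss ends up comparing proper probabilities against the labels; in this post we'll stick with adding softmax to the layer itself.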
You're likely wondering: what is the softmax function? Here's an easy way to explain it. Imagine you're arranging a summertime trip and have four choices of departure dates: June 30, July 1, July 3, and July 5. Suppose each date has been given a score reflecting how appealing it is, and you want to turn those scores into a decision.
The softmax function takes those four scores and converts them into probabilities, one per date, that always sum to 1; the higher a date's score, the larger its share of the probability. In this example, let's say the resulting probabilities were 46% (for June 30), 20% (for July 1), 19% (for July 3), and 15% (for July 5). All of these probabilities add up to 1, or 100%. In our neural network, the ten numbers coming out of the final Dense layer play the role of those scores, and softmax turns them into the probability that the input image is each of the ten digits.
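To make this concrete, here's a quick NumPy sketch of the softmax calculation. The four scores are made-up numbers I picked so the resulting probabilities land near the percentages above:

```python
import numpy as np

def softmax(scores):
    # Subtract the max for numerical stability, then exponentiate and normalize
    exps = np.exp(scores - np.max(scores))
    return exps / exps.sum()

# Hypothetical "appeal scores" for the four departure dates (invented
# numbers, chosen so the probabilities land near 46%, 20%, 19%, 15%)
dates = ["June 30", "July 1", "July 3", "July 5"]
scores = np.array([1.12, 0.29, 0.24, 0.0])

probs = softmax(scores)
for date, p in zip(dates, probs):
    print(f"{date}: {p:.0%}")

print("Total:", probs.sum())  # the probabilities always sum to 1
```

Try changing the scores: softmax will always return four probabilities that sum to 1, with bigger scores grabbing a bigger share.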
Now that we've explained the softmax function, let's see how it improves our neural network's accuracy without changing anything else in the code.
First, let’s see how the accuracy for each epoch is affected:
firstNeuralNetwork.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
firstNeuralNetwork.fit(x=trainX,y=trainY, epochs=25)
Epoch 1/25
1875/1875 [==============================] - 5s 2ms/step - loss: 2.4216 - accuracy: 0.7781
Epoch 2/25
1875/1875 [==============================] - 4s 2ms/step - loss: 0.5508 - accuracy: 0.8612
Epoch 3/25
1875/1875 [==============================] - 5s 3ms/step - loss: 0.4453 - accuracy: 0.8877
Epoch 4/25
1875/1875 [==============================] - 5s 3ms/step - loss: 0.3841 - accuracy: 0.9004
Epoch 5/25
1875/1875 [==============================] - 5s 2ms/step - loss: 0.3730 - accuracy: 0.9069
Epoch 6/25
1875/1875 [==============================] - 4s 2ms/step - loss: 0.3482 - accuracy: 0.9123
Epoch 7/25
1875/1875 [==============================] - 4s 2ms/step - loss: 0.3343 - accuracy: 0.9167
Epoch 8/25
1875/1875 [==============================] - 4s 2ms/step - loss: 0.3250 - accuracy: 0.9178
Epoch 9/25
1875/1875 [==============================] - 4s 2ms/step - loss: 0.3182 - accuracy: 0.9224
Epoch 10/25
1875/1875 [==============================] - 5s 2ms/step - loss: 0.3103 - accuracy: 0.9238
Epoch 11/25
1875/1875 [==============================] - 5s 2ms/step - loss: 0.3041 - accuracy: 0.9251
Epoch 12/25
1875/1875 [==============================] - 5s 2ms/step - loss: 0.3022 - accuracy: 0.9258
Epoch 13/25
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2983 - accuracy: 0.9280
Epoch 14/25
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2962 - accuracy: 0.9288
Epoch 15/25
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2832 - accuracy: 0.9320
Epoch 16/25
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2904 - accuracy: 0.9321
Epoch 17/25
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2861 - accuracy: 0.9308
Epoch 18/25
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2805 - accuracy: 0.9337
Epoch 19/25
1875/1875 [==============================] - 5s 2ms/step - loss: 0.2859 - accuracy: 0.9334
Epoch 20/25
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2775 - accuracy: 0.9365
Epoch 21/25
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2800 - accuracy: 0.9346
Epoch 22/25
1875/1875 [==============================] - 4s 2ms/step - loss: 0.2825 - accuracy: 0.9371
Epoch 23/25
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2743 - accuracy: 0.9370
Epoch 24/25
1875/1875 [==============================] - 6s 3ms/step - loss: 0.2749 - accuracy: 0.9383
Epoch 25/25
1875/1875 [==============================] - 5s 3ms/step - loss: 0.2703 - accuracy: 0.9372
Well, that's a significant improvement over the per-epoch accuracy from the previous post! I mean, 77.8% accuracy on just the first epoch is quite impressive, and by the 25th and final epoch, the model reaches 93.7% accuracy.
Now, let’s check out the overall accuracy of the model:
firstNeuralNetwork.evaluate(testX, testY)
313/313 [==============================] - 1s 2ms/step - loss: 0.4758 - accuracy: 0.9471
94.7% overall accuracy, all from a small change to a single line of code! If you recall from the previous post, our model's overall accuracy was just 10.5%.
Thanks for reading, and I can't wait to share all of the exciting programming content I have planned for you all in 2023!
Also, if there’s anything you can take away from this lesson, it’s that sometimes the smallest code changes can make a big difference in your program.