The digit images themselves can be downloaded through the Keras API, as you might have noticed when we imported the libraries. After running the code below, all images are stored in X_train and X_test, while the ground truths (or labels) are stored in y_train and y_test.
(X_train, y_train), (X_test, y_test) = mnist.load_data()
Now, if you want to see what the images look like, we can just run the following code. Here I decided to show the images at indices 120 to 129 of the X_train array. The output should look something like the image that I showed earlier.
fig, axes = plt.subplots(ncols=10, sharex=False,
                         sharey=True, figsize=(20, 7))
counter = 0
for i in range(120, 130):
    axes[counter].set_title(y_train[i])
    axes[counter].imshow(X_train[i], cmap='gray')
    axes[counter].get_xaxis().set_visible(False)
    axes[counter].get_yaxis().set_visible(False)
    counter += 1
plt.show()
Well, so far we haven’t actually done any preprocessing. The first thing to do now is to normalize the values that represent the brightness of each pixel, so that they lie within the range 0 to 1 instead of 0 to 255. This can be achieved simply by dividing all elements in the arrays by 255, like this:
X_train = X_train/255
X_test = X_test/255
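To see what this division does to the data, here is a minimal sketch. It assumes a small random array, X_demo, as a hypothetical stand-in for X_train (so nothing has to be downloaded), and it also shows an optional float32 cast, which is my own addition and not part of the original steps.

```python
import numpy as np

# Hypothetical stand-in for X_train: 4 random 28x28 grayscale images stored as uint8
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Dividing a uint8 array by 255 yields a float64 array with values in [0, 1]
X_scaled = X_demo / 255

# Optional variation (an assumption, not in the original): cast to float32 first
# so the scaled array uses half the memory of float64
X_scaled32 = X_demo.astype("float32") / 255

print(X_demo.dtype, X_scaled.dtype, X_scaled32.dtype)
print(X_scaled.min() >= 0.0, X_scaled.max() <= 1.0)
```

The array shape is untouched by this step. Only the value range and the dtype change.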
Next, we need to reshape both X_train and X_test. Initially, if we check the shapes of the two arrays, they are (60000, 28, 28) and (10000, 28, 28) respectively.
We need to reshape them so that there is a new axis representing a single color channel, because we are going to use convolution layers (Conv2D, which we’ll get into later) in our VAE network. To do so, we apply the reshape() method. Notice the 1 that I pass as the last argument of each reshape() call.
# Convert from (no_of_data, 28, 28) to (no_of_data, 28, 28, 1)
X_train_new = X_train.reshape(X_train.shape[0], X_train.shape[1], X_train.shape[2], 1)
X_test_new = X_test.reshape(X_test.shape[0], X_test.shape[1], X_test.shape[2], 1)
After running the code above, the shapes of X_train_new and X_test_new should be (60000, 28, 28, 1) and (10000, 28, 28, 1) respectively.
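By the way, reshape() is not the only way to add the channel axis. As a minimal sketch, assuming a small zero-filled array X_demo as a hypothetical stand-in for X_train, the code below compares reshape() with np.expand_dims() and np.newaxis indexing. All three should produce the same (no_of_data, 28, 28, 1) shape.

```python
import numpy as np

# Hypothetical stand-in for X_train: 5 blank 28x28 images
X_demo = np.zeros((5, 28, 28), dtype=np.float32)

# Same approach as in the article: spell out every dimension and append 1
via_reshape = X_demo.reshape(X_demo.shape[0], X_demo.shape[1], X_demo.shape[2], 1)

# Alternatives that append a trailing axis without listing the other dimensions
via_expand = np.expand_dims(X_demo, axis=-1)
via_newaxis = X_demo[..., np.newaxis]

print(via_reshape.shape, via_expand.shape, via_newaxis.shape)
# (5, 28, 28, 1) (5, 28, 28, 1) (5, 28, 28, 1)
```

Which one to use is mostly a matter of taste. The explicit reshape() makes the target shape visible, while the other two keep working if the image size changes.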
That’s pretty much it for the preprocessing stage. In the next step we are going to construct the VAE architecture.
Credit: BecomingHuman By: Muhammad Ardi