Image-to-image translation is currently one of the hot topics in Deep Learning, because it gives us the power to convert one image into another, using the original image as the basis for the result.
Typical examples are Horse2Zebra and apple2orange, and many other conversions are possible.
If we go back to the basics, this kind of conversion means telling the machine how to convert the images on the basis of the data we provide. We will use my own dataset, which I prepared by converting each colored image to a grayscale one.
As we know, images consist of pixels, and the conversion we want can only be achieved with an algorithm that learns how to perform it. So we will use Machine Learning to reach this goal, starting right from the basics of the algorithms we intend to use in the final conversion model. In this article, I will go through the basics of Neural Networks (a Deep Learning architecture), Convolutional Neural Networks (designed for images), then the basics of Transfer Learning and its importance, and finally our final model. We will use Keras for the Neural Network and the CNN, and implement the final model with both Keras and FastAI using GANs (Generative Adversarial Networks).
To understand what a Neural Network does, we will first go through its basic flow diagram. (I will go as deep as I can to make the basics stronger. I will not cover all the maths of Machine Learning, but I will explain the whole architecture and how it works.)
A Neural Network consists of an input layer, a series of hidden layers, and finally an output layer.
Consider these Neural Networks as an analogy for the brain. As you can see, there is an input layer, a middle layer, and an output layer. Some maths goes on inside this architecture, and you get a magical output that is close to the prediction you want to obtain. Now how does this magic happen?
MAGIC OF NEURAL NETS:
The bubbles you can see in the image above (FIG 1) are called perceptrons, and you can think of them as numbers in the neural flow. Each perceptron is connected to the perceptrons of the next layer, and so on through the network. A perceptron is a type of artificial neuron that takes in several binary inputs x1, x2, …, xn and produces a single binary output (the neurons in modern networks work the same way but with real-valued inputs and outputs). The layers other than the input and output layers are called hidden layers, and there can be a very large number of them.
So what do these perceptrons do so that the whole model produces something close to the actual value? Let's understand with an example:
As you can see, we input images of dogs and cats (a standard classification problem in Deep Learning). They are fed to the input layer of the Neural Network, the values flow forward through the layers, and the network predicts the output. This is what feedforward means in Neural Networks.
Let’s get into the Maths behind the Neural Nets :
Earlier I mentioned the term feedforward and the connections between perceptrons across the layers of a Neural Network. Each connection between perceptrons has a weight. (Think of a weight as the strength of the connection: if I increase the input, how much influence does that have on the output?) So weights tell you the importance of each input feature.
To understand the math behind Neural Networks, I urge you to go through the chain rule of differentiation.
Here we have a single-layer network with n inputs, so we have n weights, one per input. What goes into the transfer function is w1*x1 + w2*x2 + … + wn*xn + bias. Since this expression is linear, we pass it through an activation function to add non-linearity to the output. (I will explain why the output needs non-linearity.)
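To make the weighted sum concrete, here is a minimal NumPy sketch of a single neuron. The input values, the weights, and the choice of ReLU as the activation are my own illustrative assumptions, not something fixed by the diagram.

```python
import numpy as np

def relu(z):
    """Non-linear activation: keeps positive values, turns negatives into 0."""
    return np.maximum(0.0, z)

def neuron_output(x, w, b):
    """Weighted sum w1*x1 + ... + wn*xn + bias, followed by the activation."""
    z = np.dot(w, x) + b
    return relu(z)

# Illustrative values only (assumed for this sketch)
x = np.array([1.0, 2.0, 3.0])      # inputs x1, x2, x3
w = np.array([0.5, -1.0, 0.25])    # one weight per input
b = 0.5                            # bias

z = np.dot(w, x) + b               # the linear part that enters the transfer function
a = neuron_output(x, w, b)         # the same value after the non-linearity
print("weighted sum:", z)
print("after activation:", a)
```

Here the weighted sum is negative, so ReLU clips it to zero, which is exactly the kind of non-linear behaviour a plain weighted sum cannot produce.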
Random Weights Initialisation :
In the linear expression above, do we know the values of the weights initially? NO.
So we just initialize them with random values and then optimize them toward the desired output. Now we know what weights do. But what about the bias term?
The weight controls the steepness of the activation function. This means the weight decides how fast the activation function triggers, whereas the bias shifts (delays) the point at which it triggers.
A simpler way to understand what the bias is: it is somewhat similar to the constant b of a linear function y = ax + b.
It allows you to move the line up and down to fit the prediction with the data better. Without b the line always goes through the origin (0, 0) and you may get a poorer fit.
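Both ideas can be checked with a few lines of NumPy. This is only a sketch under my own assumptions: I use a sigmoid as the activation, and the data points are made up to lie on the line y = 2x + 5.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weight: a larger w makes sigmoid(w * x) rise more steeply around x = 0
x_probe = 1.0
for w in (0.5, 1.0, 5.0):
    print(f"w={w}: output at x=1 is {sigmoid(w * x_probe):.3f}")

# Bias: the output reaches 0.5 where w * x + b = 0, i.e. at x = -b / w,
# so a more negative bias delays the point where the neuron "triggers"
w = 1.0
for b in (-1.0, -2.0, -4.0):
    print(f"b={b}: output reaches 0.5 at x={-b / w}")

# Line fit: made-up data that lies exactly on y = 2x + 5
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 5.0

# Without a bias the model is y = a*x, which must pass through (0, 0)
a_no_bias = np.sum(x * y) / np.sum(x * x)          # least-squares slope
err_no_bias = np.mean((y - a_no_bias * x) ** 2)

# With a bias the model is y = a*x + b
a_fit, b_fit = np.polyfit(x, y, 1)                 # returns [slope, intercept]
err_with_bias = np.mean((y - (a_fit * x + b_fit)) ** 2)

print("error without bias:", err_no_bias)
print("error with bias:", err_with_bias)
```

The fit without b is forced through the origin and keeps a noticeable error, while the fit with b recovers the line almost exactly.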
BAD OUTPUT !!! HOW TO IMPROVE IT ??
As I said previously, after random initialization and feedforward through the neural architecture we get an output. Do you think this output will be our desired value? Of course NOT. So how will we improve it? That is what is called teaching a machine how to do a particular task.
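Here is a rough sketch of that situation: a tiny network with randomly initialized weights, one feedforward pass, and a comparison with the value we actually wanted. The layer sizes, the input, the target, and the ReLU activation are all assumptions I made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # fixed seed so the run is repeatable

n_inputs, n_hidden, n_outputs = 4, 3, 2   # assumed sizes for this sketch

# Random starting weights (small values); biases start at zero
W1 = rng.normal(0.0, 0.1, size=(n_hidden, n_inputs))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.1, size=(n_outputs, n_hidden))
b2 = np.zeros(n_outputs)

def feedforward(x):
    """Push the input through the hidden layer and then the output layer."""
    hidden = np.maximum(0.0, W1 @ x + b1)   # ReLU non-linearity
    return W2 @ hidden + b2

x = np.array([1.0, 0.5, -0.5, 2.0])   # made-up input
target = np.array([1.0, 0.0])         # made-up desired output

output = feedforward(x)
error = np.mean((target - output) ** 2)   # simple squared difference
print("output:", output)
print("target:", target)
print("error:", error)
```

With untrained weights the output has no reason to match the target, so the error stays well above zero until the weights are adjusted.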
Minimizing the Error of the output :
The output we get after feedforward will be very inaccurate, and we improve it by decreasing the error. For minimizing the error we have calculus (differentiation!). With respect to what do we improve? Our weights were randomly initialized, and the output gets better as the weights are adjusted toward better values. So we differentiate the error with respect to the weights. This is called backpropagation, which is essentially the chain rule applied layer by layer.
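To show what "differentiating the error with respect to a weight" looks like, here is a small sketch for a single neuron. The sigmoid activation, the squared error, and all the numbers are assumptions for illustration. The chain rule result is compared with a numerical estimate of the same derivative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, x, b, target):
    """Squared error of a single sigmoid neuron."""
    return (sigmoid(w * x + b) - target) ** 2

def grad_w(w, x, b, target):
    """Chain rule: d(loss)/dw = d(loss)/da * da/dz * dz/dw."""
    a = sigmoid(w * x + b)
    dloss_da = 2.0 * (a - target)   # derivative of the squared error
    da_dz = a * (1.0 - a)           # derivative of the sigmoid
    dz_dw = x                       # derivative of w*x + b with respect to w
    return dloss_da * da_dz * dz_dw

# Made-up values for the sketch
w, x, b, target = 0.3, 1.5, -0.2, 1.0

analytic = grad_w(w, x, b, target)

# Numerical check: nudge w a little in both directions
eps = 1e-6
numeric = (loss(w + eps, x, b, target) - loss(w - eps, x, b, target)) / (2 * eps)

print("chain rule gradient:", analytic)
print("numerical gradient: ", numeric)
```

The two numbers agree closely, which is a handy sanity check whenever you derive a gradient by hand.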
The error at the output can be computed in various ways. A very commonly used one is MSE (mean squared error loss): MSE = (1/n) * [(y_1 - y_pred_1)^2 + … + (y_n - y_pred_n)^2], where y_i is the actual value and y_pred_i is the prediction.
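As a quick illustration, the mean squared error can be computed in a couple of lines. The function name `mse` and the example values are my own, chosen only for this sketch.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared differences between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

actual = [1.0, 0.0, 1.0]       # made-up actual values
predicted = [0.5, 0.5, 1.0]    # made-up predictions

print("MSE:", mse(actual, predicted))
```

Squaring makes every difference positive and punishes large mistakes more than small ones.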
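After each loop of backpropagation, the weights and bias values change. In the usual gradient descent form, the update is w_new = w_old - eta * dE/dw (and likewise for the bias),
where E is the error and eta is the learning rate that we can set.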
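Here is a minimal sketch of that loop for the simplest possible case: one weight and one bias fitted with plain gradient descent (w = w - eta * gradient) and a mean squared error. The data, the starting values, and eta = 0.05 are assumptions I picked so the example converges.

```python
import numpy as np

# Made-up data that follows y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

w, b = 0.0, 0.0   # starting values (could also be random)
eta = 0.05        # learning rate

losses = []
for step in range(2000):
    y_pred = w * x + b                  # feedforward
    error = y_pred - y
    losses.append(np.mean(error ** 2))  # mean squared error

    # Gradients of the MSE with respect to w and b
    grad_w = 2.0 * np.mean(error * x)
    grad_b = 2.0 * np.mean(error)

    # Update step: move against the gradient, scaled by eta
    w -= eta * grad_w
    b -= eta * grad_b

print("learned w:", w)
print("learned b:", b)
print("first loss:", losses[0], "last loss:", losses[-1])
```

If eta is too large the updates overshoot and the loss grows instead of shrinking, and if it is too small the training takes very long, which is why the learning rate is something we tune.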
I have not explained the backprop part in much detail, as it gets harder with more layers and is difficult to visualize, but I will provide links where you can work through that part. My intent here is more to give an intuition of what is going on behind these things.
Now that we have understood the basics, we will head to the coding part. I will be using Keras (a Deep Learning framework), and we will do MNIST handwritten digit prediction.
So let's dive into some coding. I will be using Google Colab for this, but you may use your own system as well.
The MNIST data is one of the most standard datasets in Deep Learning.
Step 1: Import the required libraries
import keras
from keras.datasets import mnist
from keras.layers import Input,Dense
from keras.models import Model
Step 2: Loading the data (the MNIST dataset is already in the Keras library under keras.datasets)
(x_train,y_train),(x_test,y_test) = mnist.load_data()
Step 3: See the data:
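A quick way to look at the data is to print its shapes and value ranges. The helper `summarize` below is a hypothetical function I wrote for this sketch, and it runs on small stand-in arrays so that it works without downloading anything. In the notebook you would call it as summarize(x_train, y_train), where the images have shape (60000, 28, 28) and the labels have shape (60000,).

```python
import numpy as np

def summarize(images, labels):
    """Collect basic facts about an image dataset (hypothetical helper)."""
    return {
        "images_shape": images.shape,
        "labels_shape": labels.shape,
        "dtype": str(images.dtype),
        "pixel_min": int(images.min()),
        "pixel_max": int(images.max()),
        "classes": sorted(int(c) for c in np.unique(labels)),
    }

# Stand-in data shaped like MNIST samples (random pixels, made-up labels)
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28)).astype(np.uint8)
fake_labels = np.array([3, 1, 4, 1, 5], dtype=np.uint8)

info = summarize(fake_images, fake_labels)
for key, value in info.items():
    print(key, "->", value)
```

Knowing that the pixels are integers from 0 to 255 and that each image is 28 by 28 explains the reshaping and the division by 255 in the next step.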
Step 4: Normalize the Data and reshape it according to the input layer:
# reshape each 28x28 image into a flat vector of 784 values for the input layer
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
# convert to float and scale the pixel values from 0-255 down to 0-1
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
num_classes = 10
# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
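If "binary class matrices" sounds abstract, this is roughly what the conversion does to the labels. The function `one_hot` is my own NumPy sketch of the idea, not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer labels into rows with a single 1 at the label's position."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

sample_labels = [5, 0, 9]            # made-up digit labels
encoded = one_hot(sample_labels, 10)
print(encoded)
```

Each row has 10 entries, one per digit, which matches the 10 outputs of the final layer of the network.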
Step 5: Making of the Magical Neural Networks
# Define the input: each image in the data is 28*28, so the input layer has 784 perceptrons
inputs = Input(shape=(784,))
# These Dense layers are the hidden (middle) layers of the Neural Network
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
# The last layer contains 10 perceptrons, since we have 10 digits to predict
predictions = Dense(10, activation='softmax')(x)
model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=20, verbose=1, validation_data=(x_test, y_test))
The above code builds and trains the model. After you have trained it, you can change parameters such as the number of epochs or the optimizer, or add more Dense layers. All the documentation is available on :
To get the scores (the test loss followed by the test accuracy), you can just do
score = model.evaluate(x_test,y_test,verbose=0)
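To see where an accuracy number comes from, here is a small NumPy sketch of how softmax outputs become predicted classes. The raw output values and the true labels are made up for illustration. With the real model you would start from the probabilities it predicts for the test images.

```python
import numpy as np

def softmax(z):
    """Turn raw scores into probabilities that sum to 1 for each row."""
    z = z - np.max(z, axis=1, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Made-up raw outputs for 2 samples and 3 classes
raw_outputs = np.array([[2.0, 1.0, 0.1],
                        [0.1, 0.2, 3.0]])
true_classes = np.array([0, 1])

probs = softmax(raw_outputs)
predicted_classes = probs.argmax(axis=1)       # class with the highest probability
accuracy = np.mean(predicted_classes == true_classes)

print("probabilities:\n", probs)
print("predicted classes:", predicted_classes)
print("accuracy:", accuracy)
```

The predicted class is simply the output perceptron with the highest probability, and accuracy is the share of samples where that class matches the true label.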
CONGRATS FOR BUILDING YOUR FIRST DEEP LEARNING MODEL !!!!
In the next article I will talk about the internals of this model and how we change its different parameters, in other words the hyperparameters of the model.
Some important links :
Your Work: Following this model, make your own classifier for dogs and cats using Keras.