For this project, I will be using the PyTorch framework and a Pokémon Image Dataset on Kaggle to build a Deep Convolutional GAN (DCGAN) that generates fake Pokémon images. A DCGAN is a type of GAN that excels at producing image content because its networks are built from convolutional layers.
Training Parameters
To begin, we need to define the inputs and hyperparameters that set up the neural network architectures and the training process.
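To make this concrete, here is a minimal sketch of the settings involved. Apart from the 64 x 64 image size and the 2500 epochs mentioned later, the specific values below are common DCGAN defaults that I am assuming, not the exact ones from the original notebook.

```python
# Assumed hyperparameters, following common DCGAN defaults.
image_size = 64    # images are downscaled to 64 x 64
nc = 3             # number of colour channels (RGB)
nz = 100           # size of the latent vector fed to the generator
ngf = 64           # base feature-map count in the generator
ndf = 64           # base feature-map count in the discriminator
batch_size = 128   # mini-batch size for training
lr = 0.0002        # learning rate (DCGAN paper default)
beta1 = 0.5        # beta1 for the Adam optimizer (DCGAN paper default)
num_epochs = 2500  # the model is trained for 2500 epochs/iterations
```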
Input Data
Here is a sample of the Pokémon images that make up the dataset. For my model, I decided to downscale the images to 64 x 64 to keep training efficient and to normalize the data. A DCGAN prefers a three-channel (RGB) input; however, since the dataset's images have transparent backgrounds, they come with four channels (RGBA, where A is the alpha channel). It is recommended that you convert these to RGB images for improved performance; however, I lazily decided to skip this step 😁.
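As a rough sketch of the loading step, a torchvision pipeline like the following handles the resizing and normalization. The dataset path is hypothetical, and note that ImageFolder's default loader actually converts images to RGB, so it would quietly take care of the RGBA issue as well.

```python
import torch
import torchvision.datasets as dset
import torchvision.transforms as transforms

# Sketch of the data pipeline; "data/pokemon" is a hypothetical path.
dataset = dset.ImageFolder(
    root="data/pokemon",
    transform=transforms.Compose([
        transforms.Resize(image_size),     # shrink the shorter side to 64 px
        transforms.CenterCrop(image_size), # crop to 64 x 64
        transforms.ToTensor(),             # scale pixel values to [0, 1]
        # shift to [-1, 1], matching the generator's Tanh output range
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ]),
)
dataloader = torch.utils.data.DataLoader(
    dataset, batch_size=batch_size, shuffle=True
)
```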
Generator
Now that we have our data, we can start defining the architecture of the generator model. Given a latent vector as input, the generator outputs a 3 x 64 x 64 image matching the images in the dataset. The model itself has five transposed-convolution layers that progressively upsample the latent vector into an image. Each convolutional layer is followed by batch normalization to improve the efficiency and stability of the network. The model uses ReLU as its activation function, except for the final layer, which uses a hyperbolic tangent (Tanh). It is important to note that the network is initialized with a random set of weights and biases, so at first it outputs pure noise.
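Here is a minimal sketch of such a generator, assuming the standard layout from the DCGAN paper; the exact layer sizes are my assumptions, not copied from the original notebook. It also includes the usual DCGAN random weight initialization mentioned above.

```python
import torch.nn as nn

class Generator(nn.Module):
    # Upsamples an (nz x 1 x 1) latent vector into a 3 x 64 x 64 image.
    def __init__(self):
        super().__init__()
        self.main = nn.Sequential(
            # latent vector z -> (ngf*8) x 4 x 4
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # -> (ngf*4) x 8 x 8
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # -> (ngf*2) x 16 x 16
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # -> ngf x 32 x 32
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # -> nc x 64 x 64, squashed to [-1, 1] by Tanh
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.main(z)

def weights_init(m):
    # DCGAN-style random initialization: N(0, 0.02) for conv weights,
    # N(1, 0.02) for batch-norm scales, zeros for batch-norm biases.
    classname = m.__class__.__name__
    if classname.find("Conv") != -1:
        nn.init.normal_(m.weight.data, 0.0, 0.02)
    elif classname.find("BatchNorm") != -1:
        nn.init.normal_(m.weight.data, 1.0, 0.02)
        nn.init.zeros_(m.bias.data)
```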
Discriminator
The discriminator architecture is very similar to the generator, as it has the same overall structure. The main differences are that LeakyReLU and Sigmoid activation functions are used, and that instead of using convolutional layers to produce content, the convolutional layers are used for classification. Similar to the generator, the discriminator is also initialized with a random set of weights and biases.
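A matching discriminator sketch is below, again assuming the DCGAN-paper layout: five strided convolutions that mirror the generator (strided convolutions stand in for pooling layers in the DCGAN design), ending in a Sigmoid that outputs the probability that an input image is real.

```python
class Discriminator(nn.Module):
    # Classifies a 3 x 64 x 64 image as real (1) or fake (0).
    def __init__(self):
        super().__init__()
        self.main = nn.Sequential(
            # nc x 64 x 64 -> ndf x 32 x 32
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # -> (ndf*2) x 16 x 16
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # -> (ndf*4) x 8 x 8
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # -> (ndf*8) x 4 x 4
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # -> a single probability that the input is real
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.main(x).view(-1)
```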
Loss Functions and Optimizers
The loss function used is PyTorch's BCELoss, a Binary Cross-Entropy loss function. It is well suited to binary targets between 0 (fake) and 1 (real), making it a practical choice for our model.
For both networks, we will be using PyTorch's Adam optimizer, an adaptive learning-rate optimization algorithm.
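Putting the pieces together, a minimal setup looks like this, using the assumed learning rate and beta1 from earlier and the weights_init helper from the generator sketch:

```python
import torch.optim as optim

criterion = nn.BCELoss()  # binary cross-entropy against 0/1 labels

# Instantiate both networks with DCGAN-style random initialization.
netG = Generator().apply(weights_init)
netD = Discriminator().apply(weights_init)

# One Adam optimizer per network (lr and beta1 are the assumed values above).
optimizerG = optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerD = optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))
```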
Training Time!
The time has come to put our GAN to work and start the training process. To do so, we must define the generator and discriminator losses (this involves some math, which I will skip for simplicity's sake). Once we set up the training loop, we can just let it run for the number of epochs we defined earlier. I trained my model for 2500 epochs/iterations.
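For reference, the standard DCGAN training loop looks roughly like the sketch below; it alternates a discriminator step (on real and fake batches) with a generator step, and is a generic outline rather than the exact code from my notebook.

```python
real_label, fake_label = 1.0, 0.0

for epoch in range(num_epochs):
    for real_images, _ in dataloader:
        b_size = real_images.size(0)

        # --- Discriminator step: maximize log(D(x)) + log(1 - D(G(z))) ---
        netD.zero_grad()
        labels = torch.full((b_size,), real_label)
        loss_real = criterion(netD(real_images), labels)

        noise = torch.randn(b_size, nz, 1, 1)
        fake_images = netG(noise)
        labels.fill_(fake_label)
        # detach() so this step doesn't backpropagate into the generator
        loss_fake = criterion(netD(fake_images.detach()), labels)

        (loss_real + loss_fake).backward()
        optimizerD.step()

        # --- Generator step: maximize log(D(G(z))) ---
        netG.zero_grad()
        labels.fill_(real_label)  # the generator wants fakes labelled real
        loss_G = criterion(netD(fake_images), labels)
        loss_G.backward()
        optimizerG.step()
```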
The Results Are In!
Well, the results were interesting, to say the least. I found it particularly interesting how many of the fake Pokémon looked the same, which is a classic symptom of the GAN failure mode known as mode collapse. But I will let you be the judge of the quality of the fakes. I think it is pretty interesting, and I can see a couple of the fakes serving as inspiration for legitimate future Pokémon. The most interesting part is how the generated Pokémon vary between different epochs, or iterations, of the GAN.
You can check out the Google Colab Interactive Notebook to train the GAN yourself and make some tweaks!
Credit: BecomingHuman. By Jatin Mehta.