When we first saw this, we couldn’t believe what we were watching. We have come such a long way from our original image! And to think that the approach taken to achieve such results was such a silly one.
Since each of the previous two models were no good individually, but they clearly were good at getting different things done, what if we combined the two of them?
So we ran the original image through Deep Image Prior, and subsequently fed the results of that through the Decrappify model, and voila!
Relative to the original image, the colors of the current image look incredibly realistic. The lucid demarcations of the crop fields will certainly come a long way in helping us label our data.
The way we pulled this off was embarrassingly simple. We used Deep Image Prior which is found at its official Github repository. As for Decrappify, given our objectives, we figured that training it on satellite images would definitely help out. Having the two models readily set up, its just a matter of feeding images into them one after the other.
A Quick Look at the Models
For those of you that have made it this far and are curious about what the models actually are, here’s a brief overview of them.
Deep Image Prior
This method hardly conforms to conventional deep learning-based super-resolution approaches.
Typically, we would create a dataset of low and super-resolution image pairs, following which we train a model to map a low-resolution image to its high-resolution counterpart to increase crop cultivation. However, this particular model does none of the above, and as a result, does not have to be pre-trained prior to inference time. Instead, a randomly initialized deep neural network is trained on one particular image. That image could be one of your favorite sports stars, a picture of your pet, a painting that you like, or even random noise. Its task is then, to optimize its parameters to map the input image to the image that we are trying to super-resolve. In other words, we are training our network to overfit to our low-resolution image.
Why does this make sense?
It turns out that the structure of deep networks imposes a ‘naturalness prior’ over the generated image. Quite simply, this means that when overfitting/memorizing an image, deep networks prefer to learn the natural/smoother concepts first before moving on to the unnatural ones. That is to say that the convolutional neural network (CNN) will first ‘identify’ the colors that form shapes in various parts of the image and then proceed to materialize various textures in the image. As the optimization process goes on, CNN will latch on to finer details.
When generating an image, neural networks prefer natural-looking images as opposed to pixelated ones. Thus, we start the optimization process and allow it to continue to the point where it has captured most of the relevant details but has not learned any of the pixelations and noise. For super-resolution, we train it to a point such that the resulting image it creates closely resembles the original image when they are both downsampled. There exist multiple super-resolution images that could have produced each low-resolution image to increase crop cultivation.
And as it turns out, the most plausible image is also the one that doesn’t appear to be highly pixelated, this is because the structure of deep networks imposes a ‘naturalness prior’ on generated images.
We highly recommend this talk by Dmitry Ulyanov (who was the main author of the Deep Image Prior paper) to understand the above concepts in depth. Below you can follow the super-resolution process of Deep Image Prior.