About 2 months ago, I started my journey in deep learning and participated in my first hackathon. My first submission landed me in the top 2% of all participants. However, as more people got to know about the competition, my rank started to slip (I still managed to finish in the top 5% ;)). I wanted to improve my model, but at the time, I did not know how.
Well, now I do. Thanks to asking questions and reading other people’s code, I have some tips that will help you in your own deep learning projects.
Before we get started, just wanted to quickly mention that it is okay to use other people’s code as long as you know what it does.
For my first submission I used an image resolution of 256 and turned off max_warp in data augmentation. I used ResNet34 as my pretrained model because it was the only model I was aware of. I trained the head of the model for a few epochs, unfroze the model, found a good learning rate, and trained some more. The final validation accuracy of this model was ~93%.
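The steps above can be sketched with the fastai v1 API. This is a minimal illustration, not my exact notebook: the data path, epoch counts, and learning-rate slice are all placeholder values.

```python
# Sketch of the basic approach with fastai v1 (path and epoch counts are illustrative).
from fastai.vision import (ImageDataBunch, cnn_learner, models,
                           accuracy, get_transforms)

# Data augmentation with max_warp turned off; images resized to 256.
tfms = get_transforms(max_warp=0.0)
data = ImageDataBunch.from_folder('data/scenes', ds_tfms=tfms, size=256)

learn = cnn_learner(data, models.resnet34, metrics=accuracy)
learn.fit_one_cycle(4)        # train only the head for a few epochs
learn.unfreeze()              # make all layers trainable
learn.lr_find()               # inspect the plot to pick a learning rate
learn.fit_one_cycle(4, max_lr=slice(1e-5, 1e-3))
```

Running this needs fastai v1 and an image folder laid out class-per-subdirectory, so treat it as a template rather than a copy-paste script.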
There are 2 things we can improve upon here.
a) Accuracy and
b) Confusion between classes.
#1. Using better models (architecture)
In this approach, we follow the exact same steps that we used in the basic approach except this time, instead of only trying ResNet34, we also try ResNet50 and ResNet101.
ResNet34 is a solid pretrained model and gives pretty good results. However, to further improve performance we need more layers. Deeper models such as ResNet50 and ResNet101 can capture finer details that ResNet34 misses and tend to give slightly better results, so we experiment with them as well.
NOTE: One trick that experts use in both machine learning and deep learning is to average the predictions of several models and use that as the final prediction for every row. The resulting ensemble is usually more robust than any single model.
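Here is what that averaging trick looks like in numpy. The probability matrices are made-up numbers standing in for the per-class predictions of the three ResNets on four test images:

```python
import numpy as np

# Hypothetical per-class probabilities from three models
# on the same four test images (rows = images, cols = classes).
preds_resnet34  = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1],
                            [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]])
preds_resnet50  = np.array([[0.6, 0.3, 0.1], [0.2, 0.7, 0.1],
                            [0.2, 0.5, 0.3], [0.5, 0.4, 0.1]])
preds_resnet101 = np.array([[0.8, 0.1, 0.1], [0.1, 0.6, 0.3],
                            [0.1, 0.6, 0.3], [0.7, 0.2, 0.1]])

# Average the probabilities, then take the argmax as the final label.
ensemble = (preds_resnet34 + preds_resnet50 + preds_resnet101) / 3
final_labels = ensemble.argmax(axis=1)
print(final_labels)  # [0 1 1 0]
```

Note how the third image is ambiguous for ResNet34 on its own, but the ensemble settles it: errors that one model makes are often corrected by the others.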
#2. Progressive image resizing
In this approach, we will gradually increase the resolution of our image as we train our model.
Experiments have shown that if we train our model on lower-resolution images and then use those weights as the starting point for training on higher-resolution images, our accuracy improves. This technique is known as progressive image resizing and is highly effective. We start with size = 128 and finish with size = 256.
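In fastai v1 the two stages can be chained by swapping the Learner’s data after the low-resolution phase. Again, path and epoch counts here are illustrative placeholders:

```python
# Sketch of progressive resizing with fastai v1 (path/epochs are illustrative).
from fastai.vision import (ImageDataBunch, cnn_learner, models,
                           accuracy, get_transforms)

tfms = get_transforms(max_warp=0.0)

# Stage 1: train on low-resolution (128 px) images.
data_128 = ImageDataBunch.from_folder('data/scenes', ds_tfms=tfms, size=128)
learn = cnn_learner(data_128, models.resnet50, metrics=accuracy)
learn.fit_one_cycle(4)

# Stage 2: swap in 256 px data and keep training with the same weights.
data_256 = ImageDataBunch.from_folder('data/scenes', ds_tfms=tfms, size=256)
learn.data = data_256
learn.fit_one_cycle(4)
```

The key point is that the second `fit_one_cycle` starts from the weights learned at 128 px instead of from scratch.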
#3. Reducing confusion by removing classes and putting it all together
This is a little one I came up with.
Even after using better models and progressive image resizing, the confusion between some of the classes (mountains and glaciers) did not go down. My idea was: why not let the model first learn what mountains are, and show it glaciers later? So I started training with fewer classes to remove the confusion, and then reintroduced the removed classes and trained some more. I was surprised to find there are research papers on this kind of approach. However, in this case it did not work very well. Deep learning is ultimately a balancing act based on what we want to judge our model on: do we need better accuracy, or less confusion between classes? Also, since some of the classes in our case are inherently ambiguous (street vs. building), it is difficult to achieve really good results.
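The mechanical part of this idea is just filtering the training items by label before the first stage. Here is a toy, self-contained sketch; `samples` is a made-up stand-in for a (filename, label) training list, and in fastai you would do the equivalent filtering before building the DataBunch:

```python
# Toy sketch of the "remove confusing classes, then reintroduce them" idea.
samples = [
    ('img0.jpg', 'mountain'), ('img1.jpg', 'glacier'),
    ('img2.jpg', 'street'),   ('img3.jpg', 'mountain'),
    ('img4.jpg', 'glacier'),  ('img5.jpg', 'building'),
]

def drop_classes(samples, excluded):
    """Return only the samples whose label is not in `excluded`."""
    return [(f, y) for f, y in samples if y not in excluded]

# Stage 1: train without glaciers so the model first learns mountains.
stage1 = drop_classes(samples, {'glacier'})
# Stage 2: reintroduce the full label set and keep training.
stage2 = samples

print(len(stage1), len(stage2))  # 4 6
```

One practical wrinkle this sketch glosses over: when the removed classes come back, the model’s final layer needs outputs for them, so in practice you either keep the full output layer throughout or replace the head between stages.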
When we plot the top losses, we can see which parts of the images our model is picking up on. These parts can be wrong (the background instead of the mountain, for example) and hence cause misclassification. I would like to learn how to improve on this.
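For reference, the top-loss and confusion plots come from fastai v1’s interpretation tools; this sketch assumes `learn` is a trained Learner like the ones above:

```python
# Inspecting model mistakes with fastai v1's interpretation tools
# (assumes `learn` is an already-trained Learner).
from fastai.vision import ClassificationInterpretation

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_top_losses(9, figsize=(12, 12))  # heatmaps show where the model "looks"
interp.plot_confusion_matrix()               # e.g. mountain vs. glacier confusion
interp.most_confused(min_val=2)              # worst-confused class pairs
```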
All the notebooks and .pth files can be found on my GitHub.
To conclude, deep learning involves a lot of experimentation, and unless you get your hands dirty, you are not going to get good at it. The best way to learn is to try a lot of things. As Jeremy likes to say:
The answer to the question should I do bla, is always try bla and see
That would be it for this article. Check out some of my other articles for more deep learning projects.
As always, if you liked this article, give it at least 50 claps :p