Notes from Practical Deep Learning for Coders 2019 Lesson 1 (Part 1), and my attempt at classifying aircraft images
Other lessons: Lesson 2 / Lesson 3 / Lesson 4 / Lesson 5 / Lesson 6 / Lesson 7
Quick links: Fast.ai course page / Lecture / Jupyter Notebooks
The lecture used the example of classifying 37 breeds of cats and dogs: a fine-grained classification task.
The take-home challenge for this lesson is to follow the same approach to classify our own set of images. I decided to go with aircraft, starting simple with just 3 broad categories: space shuttles (reusable), rockets (non-reusable), and airplanes.
I’ve set up a GPU-backed VM instance on Google Cloud Platform, so I can run the course’s Jupyter notebooks against it, which makes training models much faster and easier.
Getting and preparing the data
There’s a very useful guide from the course, Creating your own dataset from Google Images, which walks you through how to obtain images from Google and quickly turn them into data for your model:
First, search for a type of image on images.google.com. You can specify the kinds of images and eliminate the types you don’t want (in my case, I wanted to limit everything to photos). I searched for “space shuttle”, “rocket”, and “airplane”.
For each image type, scroll to the very bottom of the results, and then open the JavaScript console in your browser with Cmd+Option+J on Mac (or Ctrl+Shift+J on Windows/Linux). Paste in the following commands to download a list of image URLs:
urls = Array.from(document.querySelectorAll('.rg_di .rg_meta')).map(el=>JSON.parse(el.textContent).ou);
window.open('data:text/csv;charset=utf-8,' + escape(urls.join('\n')));
I named my data files ‘urls_space_shuttle.txt’, ‘urls_rocket.txt’, and ‘urls_airplane.txt’.
I then created a folder for my data, aircrafts/ inside tutorials/fastai/course-v3/nbs/dl1/data. Inside this data/aircrafts/ folder, I also created the airplane/, space_shuttle/, and rocket/ folders.
Next, I put the data files containing the image URLs inside the aircrafts/ folder by uploading them via the Jupyter notebook UI:
We set the path variable to point to our data. We run the following for each data type:
path = Path('data/aircrafts')   # Path is available after `from fastai.vision import *`
folder = 'space_shuttle'        # set this to each class name in turn
dest = path/folder
dest.mkdir(parents=True, exist_ok=True)
The next step is to actually download the data. Fast.ai has a function, download_images(), that lets you specify the filename containing the URLs and the destination folder for the images. You can also specify the maximum number of images to be downloaded. We’ll start by limiting that number to 200.
First, specify the classes of images:
classes = ['space_shuttle','rocket','airplane']
Download all the flying machines!
file = 'urls_space_shuttle.txt'  # repeat with 'urls_rocket.txt' and 'urls_airplane.txt'
download_images(path/file, dest, max_pics=200)
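Since there are three URL files, the same calls can be wrapped in a short loop. Here is a sketch, assuming the urls_<class>.txt naming used above:
# Sketch: download up to 200 images per class, assuming the
# urls_<class>.txt file names from earlier.
for c in classes:
    dest = path/c
    dest.mkdir(parents=True, exist_ok=True)
    download_images(path/f'urls_{c}.txt', dest, max_pics=200)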
The notebook also talks about how to easily remove images that can’t be opened.
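The function the course notebook uses for this is fastai’s verify_images(), which checks each image and can delete the ones that fail to open. A minimal sketch:
# Verify each downloaded image; delete any that can't be opened,
# and shrink anything larger than 500px (fastai v1's verify_images).
for c in classes:
    print(c)
    verify_images(path/c, delete=True, max_size=500)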
Now, a crucial step: creating an ImageDataBunch object from our folder of images with ImageDataBunch.from_folder():
data = ImageDataBunch.from_folder(path, train=".", valid_pct=0.2,
ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)
We pass in a path as the first parameter. Images in tasks like these are usually resized to 224 × 224: GPUs are efficient when they apply the exact same instructions to uniformly sized batches of data. We also normalize the images so that each channel has the same mean and standard deviation as the data the pre-trained model was trained on (imagenet_stats), which also helps with training.
With valid_pct=0.2, the ImageDataBunch holds out 20% of the images as a validation set and trains on the rest.
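A quick sanity check on the resulting bunch, using standard fastai v1 attributes:
data.classes                             # expect ['airplane', 'rocket', 'space_shuttle']
len(data.train_ds), len(data.valid_ds)   # roughly an 80/20 split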
I can now visualize some of the labeled data with data.show_batch(rows=3, figsize=(7,8))
Clearly, some of the images from the set are not the most informative. But it’s a start.
Training the model
We’ll use a convolutional neural network (CNN) as our classifier. The CNN takes images as input and outputs the predicted probability for each of the 3 categories (space shuttle, rocket, and airplane).
To start, we’ll train for 4 epochs — run 4 cycles through the data.
We pass in the data (an ImageDataBunch), the model we want (in this case, resnet34), and the metrics (how performance is measured and what gets printed out during training). Here, the metric will be the error rate.
learn = create_cnn(data, models.resnet34, metrics=error_rate)
This will download the resnet34 model with pre-trained weights. This is an existing model that has already been trained for image classification on ImageNet: roughly 1.3 million pictures of all kinds of things across 1,000 categories (not necessarily aircraft). It therefore already has some idea of how to recognize images, and we’re just fitting it to our task.
This practice, known as transfer learning, also lets us train models far faster and with much less data. To avoid overfitting, we’ll use the validation set that the ImageDataBunch created.
Kick off the training with fit_one_cycle():
learn.fit_one_cycle(4)
The parameter tells fit_one_cycle() how many times to cycle through the dataset.
Results
After 4 cycles (or epochs), we get an error rate of 0.1565 (~16%), i.e. about 84% accuracy.
Let’s use ClassificationInterpretation to interpret our model:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_top_losses(9, figsize=(15,11))
plot_top_losses() shows the instances where the model was most wrong; in other words, where it was confident about an incorrect answer.
most_confused() is another informative view. It tells you which pairs of classes the model “confused” most often (its min_val argument filters out pairs confused fewer than that many times):
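interp.most_confused(min_val=2)   # e.g. [('rocket', 'space_shuttle', 4)] (illustrative output, not my actual numbers)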
We can save our model with
learn.save('stage-1')
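(If a later fine-tuning run makes things worse, these saved weights can be restored with learn.load('stage-1').)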
Unfreezing, fine tuning and learning rates
Now, we “unfreeze” the model so that all of its layers, not just the newly added final layers, can be updated as we fine-tune it.
learn.unfreeze()
Two aspects that can influence the accuracy of our model are the learning rate and the number of epochs.
The learning rate controls how big a step the model takes when updating its parameters during training. The number of epochs is how many times the model cycles through the data.
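As a toy illustration of what the learning rate does (plain gradient descent on f(w) = w**2, not fastai’s actual optimizer):
# One gradient-descent step on f(w) = w**2, starting at w = 4.0.
w, lr = 4.0, 0.1
grad = 2 * w         # derivative of w**2 at the current w
w = w - lr * grad    # a step of 0.8; a much larger lr would overshoot the minimum
print(w)             # 3.2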
We can run the learning rate finder to figure out the fastest rate the model can be trained before it starts to perform poorly.
learn.lr_find()
Next, we plot the learning rate against loss with learn.recorder.plot()
This graph shows that once the learning rate goes past about 1e-03, the loss of my model shoots up. Yet fit_one_cycle() uses a default learning rate of 0.003.
We can train again with a new learning rate, passing in a range:
learn.fit_one_cycle(2, max_lr=slice(3e-5,3e-3))
The slice() argument spreads the learning rate across the network: the earliest layers train at 3e-5 and the last layers at 3e-3. So we train the same model for 2 more epochs, this time passing in an explicit learning-rate range:
Notice that the error rate is now about 13%, down from the previous 16%.
If we want to try to make the model better, we can tweak the following:
- Tune the number of epochs
- Keep tuning the learning rate
- Tune the batch size
- Try a deeper model, e.g. resnet50 instead of resnet34 (see the sketch after this list)
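For the last idea, swapping in resnet50 only changes the learner construction; everything else stays the same (same fastai v1 calls as before, though the bigger model may need a smaller batch size to fit in GPU memory):
learn = create_cnn(data, models.resnet50, metrics=error_rate)
learn.fit_one_cycle(4)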
More ideas will emerge as we progress through the next lectures!
Credit: Julia Wu (Becoming Human)