This article was written by Johannes Rieke.
A very lightweight tutorial on object detection in images. We will bootstrap simple images and apply increasingly complex neural networks to them. In the end, the algorithm will be able to detect multiple objects of varying shapes and colors. You should have a basic understanding of neural networks to follow along.
Image analysis is one of the most prominent fields in deep learning. Images are easy to generate and handle, and they are exactly the right type of data for machine learning: easy to understand for human beings, but difficult for computers. Not surprisingly, image analysis played a key role in the history of deep neural networks.
In this blog post, we’ll look at object detection: finding out which objects are in an image and where they are. For example, imagine a self-driving car that needs to detect other cars on the road. There are lots of complicated algorithms for object detection. They often require huge datasets, very deep convolutional networks and long training times. To make this tutorial easy to follow, we’ll apply two simplifications: 1) We don’t use real photographs, but images with abstract geometric shapes. This allows us to bootstrap the image data and use simpler neural networks. 2) We predict a fixed number of objects in each image. This makes the entire algorithm much easier (it’s actually surprisingly simple, apart from a few tricks). At the end of the post, I will outline how one can expand on this approach to detect many more objects in an image.
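To make the first simplification more concrete, here is a minimal sketch of how such images might be bootstrapped with NumPy. It is an illustration under assumptions, not necessarily the code from the notebooks: the function name `make_dataset`, the 8x8 image size, the rectangle size range and the `(x, y, w, h)` bounding box format are all chosen here for the example.

```python
import numpy as np


def make_dataset(num_images=1000, img_size=8, min_size=1, max_size=4, seed=0):
    """Generate binary images that each contain one filled rectangle.

    Hypothetical helper for illustration. Returns (images, bboxes):
    images has shape (num_images, img_size, img_size),
    bboxes has shape (num_images, 4) with rows (x, y, w, h) in pixels.
    """
    rng = np.random.default_rng(seed)
    images = np.zeros((num_images, img_size, img_size), dtype=np.float32)
    bboxes = np.zeros((num_images, 4), dtype=np.float32)
    for i in range(num_images):
        # Random width and height, both in [min_size, max_size]
        w, h = rng.integers(min_size, max_size + 1, size=2)
        # Random top-left corner such that the rectangle stays inside the image
        x = rng.integers(0, img_size - w + 1)
        y = rng.integers(0, img_size - h + 1)
        images[i, y:y + h, x:x + w] = 1.0  # rows are y, columns are x
        bboxes[i] = (x, y, w, h)
    return images, bboxes


images, bboxes = make_dataset()

# Flatten each image into a vector and scale the box coordinates to [0, 1],
# so that a small feedforward network could regress the four numbers.
X = images.reshape(len(images), -1)
Y = bboxes / images.shape[1]
```

Because the bounding boxes are known at generation time, no manual labeling is needed, and the second simplification (a fixed number of objects per image) means the target `Y` always has the same shape.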
I tried to make this tutorial as simple as possible: I will go step by step, starting with the detection of a single object. For each step, there’s a Jupyter notebook with the complete code in this GitHub repo. You don’t need to download any additional dataset. The code is in Python plus Keras, so the networks should be easy to understand even for beginners. Also, the networks I use are (mostly) very simple feedforward networks, so you can train them within minutes.
Table of contents
- Detecting a single object
- Detecting multiple objects
- Putting it all together: Shapes, Colors, and Convolutional Neural Networks
- Real-world objects
To read the whole article, with each point detailed and illustrated, click here.