As you can see, Computer Vision relies heavily on Deep Learning for detection tasks.
In Lane Line Detection and Segmentation, we favor Deep Learning over traditional techniques because it’s faster and more efficient. Algorithms such as LaneNet are quite popular in research for extracting lane lines. 👉 To learn more about it, you can also check my article on Computer Vision.
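To see what LaneNet-style networks replace, here’s a minimal sketch of the traditional pipeline (edge detection plus a Hough Transform) with OpenCV. The image path, region of interest, and thresholds are placeholder values, not a production setup:

```python
# A minimal sketch of the traditional lane detection pipeline that
# Deep Learning models like LaneNet replace. File name and thresholds
# are placeholder values for illustration.
import cv2
import numpy as np

img = cv2.imread("road.jpg")  # hypothetical input frame
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Edge detection, then a Hough Transform to find line segments
edges = cv2.Canny(gray, 50, 150)

# Keep only a trapezoidal region of interest in front of the car
h, w = edges.shape
polygon = np.array([[(0, h), (w, h), (w // 2 + 50, h // 2), (w // 2 - 50, h // 2)]])
roi = np.zeros_like(edges)
cv2.fillPoly(roi, polygon, 255)
edges = cv2.bitwise_and(edges, roi)

# Fit line segments and draw them on the frame
lines = cv2.HoughLinesP(edges, rho=2, theta=np.pi / 180, threshold=50,
                        minLineLength=40, maxLineGap=100)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(img, (x1, y1), (x2, y2), (0, 255, 0), 3)
```

This pipeline breaks down with curves, shadows, and worn paint, which is exactly why learned approaches took over.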
2D Object Detection is also at the heart of Perception. Algorithms such as YOLO or SSD are very popular in this field and have been explained extensively. They’re constantly updated and replaced by newer versions, but the core idea stays the same.
👉 If you’d like to learn more, here’s my research review on YOLOv4.
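If you want to try one yourself, here’s a hedged sketch of running a YOLO-family detector through torch.hub. It assumes the ultralytics/yolov5 repo and its pretrained weights are reachable, and the image path is hypothetical:

```python
# A sketch of 2D object detection with a YOLO-family model via torch.hub.
# Assumes network access to the ultralytics/yolov5 repo and weights.
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
results = model("street.jpg")            # hypothetical image path
detections = results.pandas().xyxy[0]    # one row per box: coords, confidence, class
print(detections[["name", "confidence"]])
```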
Finally, many camera setups are built in Stereo.
Having Stereo information helps us build what’s called a Pseudo-LiDAR. We can emulate, and sometimes even replace, the LiDAR, and therefore do 3D Perception with cameras (2D sensors).
For that, we used to implement Block Matching using traditional Computer Vision… and we’re now switching to Deep Learning.
👉 My article on Pseudo-LiDARs and my course on 3D Computer Vision can help you understand more.
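Here’s what that traditional Block Matching step looks like with OpenCV, followed by the triangulation formula depth = focal_length × baseline / disparity. The image paths, focal length, and baseline are assumed values for illustration:

```python
# A minimal sketch of Stereo Block Matching, the traditional step behind
# Pseudo-LiDAR. Paths, focal length and baseline are placeholder values.
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Classical Block Matching: compare patches along epipolar lines
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0  # fixed-point to pixels

# Triangulation: depth = focal_length * baseline / disparity
focal_length_px = 700.0   # assumed focal length, in pixels
baseline_m = 0.54         # assumed distance between the two cameras, in meters
valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = focal_length_px * baseline_m / disparity[valid]
```

Once you have a depth value per pixel, you can back-project the pixels into a 3D point cloud, which is exactly what the Pseudo-LiDAR idea exploits.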
When it comes to LiDAR point clouds, traditional approaches based on the RANSAC algorithm, 3D Clustering, KD-Trees, and other unsupervised learning techniques are still the go-to for many robotics applications.
👉 I teach these in my Point Cloud Fast Course if you’re interested.
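Here’s a sketch of that traditional LiDAR pipeline: RANSAC to remove the ground plane, then unsupervised clustering on what remains. It assumes Open3D and a sample point cloud file; the thresholds are illustrative:

```python
# A sketch of the traditional LiDAR pipeline: RANSAC ground removal,
# then DBSCAN clustering. Assumes Open3D and a sample .pcd file.
import open3d as o3d
import numpy as np

pcd = o3d.io.read_point_cloud("scan.pcd")  # hypothetical LiDAR scan

# RANSAC: fit the dominant plane (the road) and drop its inliers
plane_model, inliers = pcd.segment_plane(distance_threshold=0.2,
                                         ransac_n=3,
                                         num_iterations=1000)
obstacles = pcd.select_by_index(inliers, invert=True)

# DBSCAN: group the remaining points into obstacle clusters
labels = np.array(obstacles.cluster_dbscan(eps=0.5, min_points=10))
print(f"Found {labels.max() + 1} obstacle clusters")
```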
However, they’re gradually being replaced by Deep Learning approaches, which are faster and safer. Why? Because naïve algorithms can’t classify obstacles, or tell that two people standing very close together are indeed two people, since their points merge into a single cluster. A learning-based approach is better suited here.
Many 3D object detection papers have been released by companies such as Apple (VoxelNet), Uber ATG (PIXOR and Fast and Furious), nuTonomy (PointPillars), and the University of Oxford (RandLA-Net).
At the heart of it, we find techniques such as 3D CNNs (Convolutional Neural Networks) or PointNet. These are the fundamentals of 3D Deep Learning.
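To give you an intuition for PointNet, here’s a minimal PyTorch sketch of its core idea: a shared per-point MLP followed by a max-pool, which makes the global feature invariant to the order of the points. The layer sizes are illustrative, not the paper’s exact architecture:

```python
# A minimal sketch of PointNet's core idea. Layer sizes are illustrative.
import torch
import torch.nn as nn

class TinyPointNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # 1D convolutions with kernel size 1 act as a shared per-point MLP
        self.mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.ReLU(),
        )
        self.classifier = nn.Linear(1024, num_classes)

    def forward(self, points):                   # points: (batch, 3, num_points)
        features = self.mlp(points)              # (batch, 1024, num_points)
        globalfeat = features.max(dim=2).values  # max-pool: order-invariant
        return self.classifier(globalfeat)

logits = TinyPointNet()(torch.rand(2, 3, 1024))  # 2 clouds of 1024 points each
```

The max-pool is the key design choice: a point cloud has no natural ordering, so the network needs a symmetric function to aggregate per-point features.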
These days, LiDAR detection using Deep Neural Networks is booming. It’s one of the most active areas of research in self-driving cars.
RADAR is a very mature sensor. It’s over 100 years old, and it’s no shame to say it doesn’t need Deep Learning to be effective. Using RADARs, we’ve been able to measure obstacle speeds for decades. In fact, if you’ve gotten a speeding ticket lately, it’s because of a RADAR.
👉 The techniques used for this are well explained in my dedicated article.
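The core of it is a back-of-the-envelope Doppler calculation: the relative speed is v = (doppler_shift × c) / (2 × carrier_frequency). Here’s a tiny sketch with illustrative numbers, not readings from a real sensor:

```python
# Doppler speed estimation: v = (doppler_shift * c) / (2 * carrier_frequency).
# All values are illustrative, not from a real sensor.
c = 3e8                     # speed of light, m/s
carrier_hz = 77e9           # typical automotive RADAR band (~77 GHz)
doppler_shift_hz = 5133.0   # measured frequency shift (assumed)

relative_speed = doppler_shift_hz * c / (2 * carrier_hz)
print(f"{relative_speed:.1f} m/s")  # ~10 m/s, i.e. ~36 km/h
```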
Deep Learning in RADARs is starting to emerge with algorithms such as Centric 3D Obstacle Detection or RADAR Region Proposal Network. However, this still seems to be early-stage research.
The final part of Perception is Sensor Fusion.
To make detections reliable, we include what’s called redundancy. The idea is simple: we merge data from multiple sensors and check whether they tell the same story.
For a company using all 3 sensors, there are 3 ways to merge:
- Merging Camera and LiDAR
- Merging Camera and RADAR
- Merging LiDAR and RADAR
Here’s a map that shows you all the ways we use Deep Learning in Sensor Fusion.
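To make the redundancy idea concrete, here’s a toy late-fusion sketch: project the detections from two sensors into the same image plane and check whether they agree via Intersection over Union. The boxes, the threshold, and the iou helper are all illustrative, not a real API:

```python
# A toy late-fusion redundancy check. Boxes, threshold, and the iou
# helper below are illustrative values, not a real fusion stack.

def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

camera_box = (100, 120, 220, 300)  # from the camera detector (assumed)
lidar_box = (105, 118, 225, 290)   # LiDAR box projected into the image (assumed)

# If both sensors see the same obstacle, we trust the detection
if iou(camera_box, lidar_box) > 0.5:
    print("Redundant detection: camera and LiDAR agree")
```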