Perception, Prediction, Planning & Control
Perception, prediction, planning and control are the core capabilities that enable autonomous driving.
Perception takes sensor inputs and uses them to represent the scene in terms of objects and scene semantics. The dimensions of perception include the objects themselves (type, visibility, language), the environment (time of day, season, road conditions) and the scene configuration (driving conventions and rules). Prediction combines the perceived scene with past behaviour, high-level scene semantics, object attributes, appearance cues and the behaviour of other agents (other vehicles, pedestrians, animals) to forecast what those agents will do next. Planning uses these forecasts to generate a plan, which in turn controls the vehicle's behaviour, ensuring that it is safe and comfortable, sends the right signals to other traffic participants and makes progress along the route. Usually, when an autonomous vehicle is not confident in its perception, it reverts to driving more conservatively.
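The loop described above can be sketched in code. This is a minimal, purely illustrative pipeline; every type, field and threshold here is a made-up assumption (real stacks use learned models over rich sensor data), but it shows how low perception confidence can propagate into a more conservative plan.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical types for a toy perception -> prediction -> planning ->
# control loop; all names and numbers are illustrative, not a real API.

@dataclass
class DetectedObject:
    kind: str                     # e.g. "vehicle", "pedestrian", "animal"
    position: Tuple[float, float] # (x, y) in the ego frame, metres
    confidence: float             # detector confidence in [0, 1]

@dataclass
class SceneState:
    objects: List[DetectedObject]
    min_confidence: float

def perceive(sensor_frames) -> SceneState:
    """Turn raw sensor input into objects + scene semantics (stubbed)."""
    objects = [DetectedObject("pedestrian", (12.0, 1.5), 0.6)]
    min_conf = min((o.confidence for o in objects), default=1.0)
    return SceneState(objects, min_conf)

def predict(scene: SceneState) -> list:
    """Forecast a future position for each agent (trivial constant shift)."""
    return [(o, (o.position[0] - 1.0, o.position[1])) for o in scene.objects]

def plan(scene: SceneState, predictions: list) -> dict:
    """Pick a target speed; drive conservatively when perception is unsure."""
    target = 15.0  # m/s, nominal cruise speed (assumed)
    if scene.min_confidence < 0.7:
        target = 5.0  # low confidence -> slow down
    return {"target_speed": target}

def control(plan_out: dict) -> dict:
    """Map the plan to an actuator command (steering omitted for brevity)."""
    return {"throttle": min(plan_out["target_speed"] / 30.0, 1.0)}

scene = perceive(sensor_frames=None)
commands = control(plan(scene, predict(scene)))
# The low-confidence pedestrian detection forces a conservative target speed.
print(commands)
```

The key design point is the last rule in `plan`: uncertainty estimated during perception flows downstream and directly shapes driving behaviour, which is what "reverting to driving more conservatively" means in practice.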
The various companies working on the challenge of autonomous vehicles have taken different approaches, and they fall into two distinct camps: those relying mainly on computer vision, and those relying on HD mapping. Tesla takes the former approach, whereas Waymo takes the latter. Speaking about this recently, Andrej Karpathy, the head of AI and computer vision at Tesla, said:
Waymo and many others in the industry use high-definition maps. You have to first drive some car that pre-maps the environment, you have to have lidar with centimeter-level accuracy, and you are on rails. You know exactly how you are going to turn in an intersection, you know exactly which traffic lights are relevant to you, you know where they are positioned and everything. We do not make these assumptions. For us, every single intersection we come up to, we see it for the first time. Everything has to be solved — just like what a human would do in the same situation.
Even though this approach is more complex and harder to scale than the HD-mapping-based one, Tesla, as per Andrej, aims to scale its AI across millions of vehicles:
Speaking of scalability, this is a much harder problem to solve, but when we do essentially solve this problem, there’s a possibility to beam this down to again millions of cars on the road. Whereas building out these lidar maps on the scale that we operate in with the sensing that it does require would be extremely expensive. And you can’t just build it, you have to maintain it and the change detection of this is extremely difficult.
So Tesla has taken a vision-based approach and must therefore solve perception on the spot rather than relying on a pre-fed map. Waymo's strategy of using lidar and HD maps, on the other hand, is very expensive: every single road has to be scanned, and a considerable amount must then be spent to keep the HD and lidar maps up to date. This creates challenges with respect to cost-effectiveness and scalability. To overcome this challenge, Waymo is working on learning agent models from real-world demonstrations to train its self-driving algorithm.
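Learning from real-world demonstrations is often framed as behavioural cloning: a policy is fitted to reproduce the actions human drivers took in recorded states. The sketch below is a toy version under invented assumptions (a two-feature state, a synthetic "expert" braking rule, and a linear model); real systems train large neural networks on rich scene representations.

```python
import numpy as np

# Toy behavioural cloning: fit a policy to imitate expert braking decisions.
# The demonstration data and the expert rule are both synthetic assumptions.

rng = np.random.default_rng(0)

# Fake demonstrations: state = (distance to lead car in m, own speed in m/s),
# expert action = braking amount in [0, 1] chosen by a "human driver".
states = rng.uniform([5.0, 0.0], [50.0, 30.0], size=(1000, 2))
expert_actions = np.clip((states[:, 1] - 0.5 * states[:, 0]) / 30.0, 0.0, 1.0)

# Fit a linear policy  action ~ w1*distance + w2*speed + b  by least squares.
X = np.hstack([states, np.ones((len(states), 1))])
w, *_ = np.linalg.lstsq(X, expert_actions, rcond=None)

def policy(distance: float, speed: float) -> float:
    """Braking amount predicted by the policy cloned from demonstrations."""
    return float(np.array([distance, speed, 1.0]) @ w)

# Like the expert, the cloned policy brakes harder when close and fast
# than when far away and slow.
print(policy(10.0, 25.0), policy(40.0, 10.0))
```

The appeal of this approach is that the "labels" come for free from ordinary driving logs, so coverage grows with fleet mileage instead of with mapping effort.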