8 takeaways from mobileye's End-To-End talk

deep learning Oct 21, 2024

Earlier this month, mobileye held a talk sharing their thoughts on End-To-End Learning (including Tesla's E2E system), and why they weren't going to use fully E2E in their own self-driving cars.

Below are 8 ideas I got from the first part:

Idea 1: MTBF: Mean Time Between Failures

Basically, how many hours between 2 interventions? What mobileye says is that E2E adds uncertainty to that time, while their Compound AI System (CAIS - AI modules and bricks) doesn’t. The typical range is between 50k hours and 100M hours, so this is a lot.

Mobileye believes Waymo is the only one who passed the required levels; and therefore can really be « eyes off - hands off ».

I'll get back to this point with idea 7 to illustrate it with Tesla.

Idea 2: There are 3 typical self-driving prototypes: Waymo, Tesla, and Mobileye.

  • Unlike Tesla, Mobileye says it’s not camera only, but camera centric (they double down on camera, but use other sensors)
  • Unlike Waymo, they claim to use HD Maps, but have a real scalable system to build them (called REM — I actually describes it in one of my daily emails here)
  • Unlike Tesla, it’s not full E2E, but a CAIS (Compound AI System). AI is a part of the solution but not the full solution.
  • Finally comes the MTBF, which is unknown for Tesla, and to be developed for Mobileye

Idea 3: E2E give 2 main promisesno « glue » code and fully unsupervised learning. 

This is the idea behind End-To-End: input > model > output. I actually describe it in this post.

End-To-End principle (source)

According to mobileye, the reality is different: there is code to glue, and tons of problems.

Idea 4: The AV Alignment problem

There is still glue to put.

For example, people don't fully stop at STOP signs. So you can't use your data alone with no glue. You need to add glue. In E2E, the glue is not "online" (the algorithms running live), but "offline" (the data is altered, the training losses changes penalty, etc...)

Idea 5: The Calculator problem

Chat-GPT can calculate 2x2, but if you start asking for long multiplications with many numbers, it'll get confused. So much that OpenAI shifted to a calculator solution, where they replace LLMs with python code when they identify a query for complex calculations. 

(even in the picture below taken today with a python script, you can see errors!)

undefined
When asked complex reasoning, Chat-GPT gets out of the LLM world and goes into Python code

Idea 6: The Shortcut Learning Problem

Because camera, LiDAR, and RADAR are all different, just fusing them all in a big network (the E2E approach) is stupid because it doesn't leverage their unique advantages (RADAR for speed, camera for context, etc...).

Example of the shortcut learning problem and why fusion isn't that straightforward (source)

Idea 7: The Long Tail Problem

Solving one edge case doesn't guarantee your system will get better. It depends on how rare the edge case is, and how related it is to your driving behavior. 

The (publicly available) Tesla FSD Tracker Data shows that while there has been a significant improvement from FSD 11 (Modular) to FSD 12 (E2E), FSD then decreased from 12.3 to 12.5. So it shows it's still gradual improvements versus a revolution.

undefined
FSD 12 may be better than 11, but FSD 12.5 isn't better than FSD 12.3

Idea 8: Primary Guardian Fallback

In Sensor Fusion, what do you do if 2 sensors tell a different thing?

Take a lane detection approach, where we try to classify the type of lane we're in...

How do we do this lane detection approach? You can use a lane detector, or you can project the lanes from your HD Map using accurate localization and mapping.

What if these 2 systems tell a different lane type? Mobileye has the PGF — Primary Guardian Fallback system:

  • Primary: Your lane detector output
  • Fallback: Your HD Map output
  • Guardian: A "guardian" E2E Network output

So basically, redundancy!

If the guardian agrees with the primary, take primary. Otherwise take fallback. Mobileye proved that this significantly helped reduce the overall error.

And that is it!

Next Steps

I wrote 2 good articles about End-To-End Learning, you can learn more here:

Tags

Mindmap

Interested in Autonomous Systems? Download the Self-Driving Car Engineer Mindmap

The Self-Driving Car Engineer Mindmap is a video + PDF mindmap showing you the main areas of self-driving cars, and giving you a path to build a career as a self-driving car engineer.