A Closer Look at Tesla's Nvidia-powered Supercomputer for Training Deep Neural Networks
At the Computer Vision and Pattern Recognition (CVPR 2021) conference this week, Tesla's senior director of AI, Andrej Karpathy, shared details of the company's latest supercomputer, which will be used to train deep neural networks (DNNs) for Tesla's Autopilot and Full Self-Driving (FSD) autonomous driving features.
Tesla said its new supercomputer will be ready by the end of the year. It's actually the predecessor to Tesla's more powerful supercomputer nicknamed "Dojo", which Tesla Chief Executive Elon Musk says should be ready by the end of the year.
"Project Dojo" was first announced at Tesla's Autonomy Investor Day in April 2019. Musk mentioned that the supercomputing power of "Dojo" will help Tesla better label visual data, which is a difficult and time-consuming task for developers of self-driving vehicles.
For autonomous vehicles, neural networks are used to train software for complex tasks, such as identifying street signs, detecting pedestrians and predicting their movements, as well as for safe navigation. A typical self-driving vehicle can use dozens of DNNs for perception, localization and path planning.
However, training neural networks requires a hefty amount of processing power, which is why Tesla built its supercomputer using powerful Nvidia GPUs.
Tesla's supercomputer uses 720 nodes of 8x NVIDIA A100 Tensor Core GPUs (5,760 GPUs total) to achieve 1.8 exaflops of performance, making it one of the world's most powerful computers. This kind of processing power is mind-boggling. One exaflop is one quintillion (10^18) floating-point operations per second.
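As a rough sanity check on these figures, the short sketch below multiplies the node and GPU counts quoted above by a per-GPU peak. The article does not say which numeric precision the 1.8 exaflops figure refers to, so the per-GPU value used here (NVIDIA's published peak of 312 teraflops for dense FP16/BF16 Tensor Core math on one A100) is an assumption, as are the function and variable names.

```python
# Figures quoted in the article.
NODES = 720
GPUS_PER_NODE = 8

# Assumption: NVIDIA's published peak for dense FP16/BF16 Tensor Core
# math on a single A100. The article does not state the precision.
A100_PEAK_FLOPS = 312e12

EXA = 10**18  # one exaflop = 10^18 floating-point operations per second


def cluster_summary(nodes=NODES, gpus_per_node=GPUS_PER_NODE,
                    per_gpu_flops=A100_PEAK_FLOPS):
    """Return (total GPU count, theoretical peak in exaflops)."""
    total_gpus = nodes * gpus_per_node
    peak_exaflops = total_gpus * per_gpu_flops / EXA
    return total_gpus, peak_exaflops


total_gpus, peak_exaflops = cluster_summary()
print(f"{total_gpus:,} GPUs, about {peak_exaflops:.2f} exaflops peak")
```

Under that assumption the result rounds to the 1.8 exaflops quoted above. It is a theoretical peak, not a measured benchmark score.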
NVIDIA's A100 GPUs power the world's highest-performing data centers. The A100 GPU provides up to 20x higher performance over the prior generation, according to NVIDIA.
"This is a really incredible supercomputer," Karpathy said during his presentation at CVPR 2021. "I actually believe that in terms of flops, this is roughly the No. 5 supercomputer in the world."
Tesla uses the data from more than one million Tesla vehicles on the road to refine and build new autonomous driving features for continuous improvement. Having a fleet of connected vehicles to regularly collect data from gives Tesla an enormous advantage over other automakers in the development of autonomous driving technology.
Camera data from Tesla vehicles is continuously fed into the supercomputer in order to improve the software powering Autopilot and FSD. The neural networks are used to label 4D data from videos taken from eight onboard cameras that make up each vehicle's 360-degree perception system.
How Tesla's Supercomputer is Put to Work
In a blog post, NVIDIA's Senior Director of Automotive, Danny Shapiro, provided an overview of how Tesla's supercomputer is used to train its deep neural networks for autonomous driving. Tesla's cyclical development actually begins in the car, he said.
Whenever a Tesla vehicle is driving, a deep neural network runs in the background in what is known as "shadow mode," quietly perceiving and making predictions without actually controlling the vehicle, explained Shapiro.
The predictions are recorded, and any mistakes or misidentifications are logged. Tesla engineers then take this information and use each instance to create a training dataset of difficult and diverse scenarios to refine the DNN for improved performance.
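The shadow-mode idea described above can be illustrated with a minimal sketch. The article does not explain how Tesla decides that a prediction was a mistake, so everything here is hypothetical: the `Frame` record, its `observed` field (standing in for whatever signal reveals the correct answer), and the `collect_hard_cases` helper are invented for illustration and are not Tesla's implementation.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    clip_id: str    # hypothetical identifier for the recorded clip
    predicted: str  # what the background (shadow) network predicted
    observed: str   # hypothetical signal for what actually happened


def collect_hard_cases(frames):
    """Return the ids of clips where the shadow prediction was wrong.

    The network never controls the vehicle here; its output is only
    compared with the observed outcome and disagreements are logged.
    """
    hard_cases = []
    for frame in frames:
        if frame.predicted != frame.observed and frame.clip_id not in hard_cases:
            hard_cases.append(frame.clip_id)
    return hard_cases


# Made-up example data.
drive_log = [
    Frame("clip-001", predicted="stop sign", observed="stop sign"),
    Frame("clip-002", predicted="no pedestrian", observed="pedestrian"),
    Frame("clip-002", predicted="no pedestrian", observed="pedestrian"),
    Frame("clip-003", predicted="green light", observed="green light"),
]

print(collect_hard_cases(drive_log))
```

In this toy log, only the clip with the mismatched prediction is flagged as a candidate for the training dataset.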
The dataset is a collection of roughly one million 10-second clips recorded at 36 frames per second, equal to around 1.5 petabytes of data. The DNN runs through these scenarios in the data center over and over until it operates without a mistake. From there, it's sent back to the vehicle and the process repeats.
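To put the dataset figures in perspective, the sketch below works out the frame count and average sizes they imply. It assumes a decimal petabyte (10^15 bytes). The article does not say whether a "frame" covers one camera or all eight, so the per-frame figure is only an average over whatever is stored per time step.

```python
# Figures quoted in the article.
CLIPS = 1_000_000
SECONDS_PER_CLIP = 10
FRAMES_PER_SECOND = 36

# Assumption: "1.5 petabytes" means 1.5 * 10^15 bytes (decimal units).
DATASET_BYTES = 1.5e15

total_frames = CLIPS * SECONDS_PER_CLIP * FRAMES_PER_SECOND
bytes_per_clip = DATASET_BYTES / CLIPS
bytes_per_frame = DATASET_BYTES / total_frames

print(f"Total frames: {total_frames:,}")
print(f"Average per clip: {bytes_per_clip / 1e9:.1f} GB")
print(f"Average per frame: {bytes_per_frame / 1e6:.2f} MB")
```

That works out to 360 million frames, averaging about 1.5 GB per clip and roughly 4.17 MB per frame.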
Karpathy said training a DNN in this manner and on such a large amount of data requires a massive amount of compute power, which led Tesla to build and deploy the current generation supercomputer with NVIDIA's high-performance A100 GPUs.
In addition to training DNNs, Tesla's supercomputer provides vehicle engineers with the high performance needed to experiment and iterate in the development process.
Karpathy said the current DNN structure that Tesla is deploying allows a team of 20 engineers to work on a single network at once, isolating different features for parallel development, which is much faster.
These DNNs can then be run through training datasets faster than was previously possible, enabling rapid iteration.
"Computer vision is the bread and butter of what we do and enables Autopilot," said Karpathy. "For that to work, you need to train a massive neural network and experiment a lot. That's why we've invested a lot into the compute."
Tesla's autonomous driving system uses just radar and cameras, while most other developers of self-driving vehicles rely on supplemental lidar data. However, as Tesla's computer vision capabilities improve, the radar will no longer be needed. Tesla said it would drop radar altogether, and began transitioning solely to the camera-based system in its Model 3 and Model Y vehicles starting in May.
During the company's fourth-quarter earnings call in January, Musk said the upcoming Dojo supercomputer could potentially be offered as a service to other companies for training their neural networks.
"So some of the others need neural net training, we're not trying to keep it to ourselves," Musk said in January. "So I think there could be a whole line of business in and of itself."
Tesla's supercomputer is also a work in progress and the company plans to build an even more powerful one than the upcoming Dojo in the future. Although Karpathy declined to elaborate on the next iteration of Dojo, he said it will take Tesla's supercomputing plans to the "next level", which might get Musk closer to his target of level-5 autonomous driving, which requires no human supervision whatsoever.
Originally hailing from New Jersey, Eric is an automotive & technology reporter covering the high-tech industry here in Silicon Valley. He has over 15 years of automotive experience and a bachelor's degree in computer science. These skills, combined with technical writing and news reporting, allow him to fully understand and identify new and innovative technologies in the auto industry and beyond. He has worked at Uber on self-driving cars and as a technical writer, helping people to understand and work with technology.