After Mercedes-Benz integrated ChatGPT into its car system, other automakers such as Ideal and NIO quickly followed, using cutting-edge chatbots to upgrade their in-car systems. But human-machine interaction is not the end goal; autonomous driving is the ultimate destination. Large AI models have brought a more human-like "neural pathway" technology route to autonomous driving, and shedding the reliance on high-precision maps is the first step.
The curtain has risen on ChatGPT's transformation of the car, and Mercedes-Benz is taking the lead.
Not long ago, Mercedes-Benz integrated ChatGPT into its car system and began a three-month test. The results showed that its voice assistant could not only execute simple commands but also hold continuous multi-turn conversations, with greatly improved comprehension and response quality.
Ideal, Skyworth, NIO, and other car manufacturers followed suit, using cutting-edge GPT capabilities to make their car systems more intelligent. The car system has completed its transformation from the original "radio" into a feature-rich smart terminal, and with the addition of a GPT "brain" it has begun to evolve from a clunky, unremarkable machine into a driving partner.
However, human-car interaction is not the end of AI in cars; autonomous driving is the future. Past autonomous driving solutions relied too heavily on high-precision maps: once the map fails to keep up with rapidly changing road conditions, driving safety is threatened. The evolution of large AI models has given car companies an opening.
Letting AI perceive and decide on its own, and abandoning the reliance on high-precision maps, is becoming the mainstream trend. A few days ago, Ideal Auto launched an internal test of City NOA (navigation-assisted driving). It uses BEV (bird's-eye view) as the primary solution, letting the car imitate a human's "neural pathways" while driving. Through continuous learning, City NOA can even be trained into a dedicated "driver" for the user's commuting route.
Picking up where the internet left off, AI is transforming cars at a deeper level, and these four-wheeled machines are looking more and more like Transformers.
Car systems meet GPT, with Mercedes-Benz taking the lead.
A metamorphosis is sweeping the automotive industry: from fossil-fuel engines to new energy, from a means of transportation to an intelligent product. For years, technology has continuously transformed cars inside and out. After the internet transformed the car, artificial intelligence has arrived.
Mercedes-Benz is leading the charge in this new wave, aiming to integrate ChatGPT into its vehicles.
On June 16, Mercedes-Benz launched a three-month ChatGPT testing program in the United States in partnership with Microsoft. ChatGPT is integrated into the vehicle through the Azure OpenAI Service; owners can opt in through the Mercedes me app or directly by voice in the car: "Hey Mercedes, I want to join the testing program." Mercedes-Benz's MBUX infotainment system then connects the "Hey Mercedes" voice assistant to ChatGPT.
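Mercedes-Benz has not published its integration details, but the conceptual pipeline is simple: the cabin microphone transcribes speech, the text is forwarded to an Azure OpenAI deployment, and the reply is read back to the driver. Here is a minimal sketch of the middle step in Python, assuming transcription has already happened; the endpoint, key, and deployment name are hypothetical placeholders, not Mercedes-Benz's actual configuration:

```python
# Minimal sketch: route one transcribed voice command to Azure OpenAI.
# The endpoint, key, and deployment name are placeholders; the real
# "Hey Mercedes" integration details are not public.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://example.openai.azure.com",  # hypothetical
    api_key="YOUR_KEY",
    api_version="2024-02-01",
)

def answer_driver(utterance: str) -> str:
    """Send one transcribed utterance to the model and return the reply text."""
    response = client.chat.completions.create(
        model="hey-mercedes-gpt",  # hypothetical deployment name
        messages=[
            {"role": "system",
             "content": "You are an in-car voice assistant. Keep answers short."},
            {"role": "user", "content": utterance},
        ],
    )
    return response.choices[0].message.content

print(answer_driver("Recommend a place for dinner near my destination."))
```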
In the past, "Hey Mercedes" could provide standardized information such as sports scores and weather, answer questions about the vehicle's surroundings, and control the user's smart home. ChatGPT makes the question-and-answer process far more flexible: users can ask for detailed destination information, get dinner recommendations, and pose continuous follow-up questions and receive replies. That is ChatGPT's specialty.
Currently, only the roughly 900,000 MBUX-equipped Mercedes-Benz cars in the United States can test ChatGPT. Mercedes-Benz plans to use this initial testing period to better understand user requests, set priorities for future development, and adjust launch strategies for different markets and languages.
Regarding the integration, Mercedes-Benz made an emotional declaration: "All goals revolve around redefining your relationship with Mercedes-Benz." The company wants ChatGPT to reshape the human-vehicle interaction experience. A more vivid way to put it: the car's infotainment system has "come alive", transforming from a taciturn, function-focused machine into a companion inside the car.
Following Mercedes-Benz, domestic automakers are also keeping up with the trend.
On June 19, Ideal Auto unveiled "Mind GPT", a cognitive model developed in-house by its spatial algorithm team. The training of this large model reportedly began well before ChatGPT was released. Built on tens of terabytes of raw training data, Mind GPT used 1.3 trillion tokens for base-model training; it can recognize voiceprints and speech content, understand dialects, and offer owners travel planning and even AI drawing and calculation functions.
Ideal Auto revealed that after Mind GPT's release it will add an LUI (Language User Interface) interaction mode: "For example, if you want to eat hot pot, just call Ideal Classmate, and the car's interface will generate pictures of hot pot for you to choose from, then automatically calculate the travel route."
Recently, Changan, NIO, Xpeng, and Chery all applied for GPT-related trademarks.
Putting GPT in the car has become a trend. Zhang Junyi, a partner at Aowei Consulting, believes GPT integration can improve both human-vehicle interaction and the car's overall interaction with its environment. As hardware differences within a price range shrink, and comfort, safety, power, and range can no longer create significant differentiation, intelligence becomes an inevitable choice.
Equipping the Smart Cabin with a “Brain”
ChatGPT's arrival in the car writes a remarkable chapter in the history of automotive evolution: cutting-edge natural language models are being applied to everyday transportation, paving the way for a richer in-car experience.
Looking back more than 30 years, in-car entertainment and smart vehicle technology were still in their infancy. The first generation of car stereos emerged in the 1980s and '90s, when attention was focused on the automobile's "big three": the engine, the chassis, and the transmission. Suddenly, some models could not only receive radio but also play cassette tapes, offering a glimpse of a second life inside the car.
The second generation of car stereos added DVD and MP3 playback, underscoring the importance of entertainment, and took a step toward a better driving experience with in-car navigation. Solving the problem of "getting lost" became mainstream. Many veteran drivers will remember that before the connected-car era, GPS navigation units such as Garmin's were standard on high-end models, relying on satellite positioning and map data stored in the car to achieve reasonably accurate navigation.
However, beyond navigation, music, and radio, people at the time did not expect much from car stereos, which were rarely the deciding factor in a purchase.
In the 21st century, electronics and digital technology kept advancing, and the mobile phone changed form first. Following the same trajectory, car systems with large screens and intelligent features emerged as a new selling point. Manufacturers adopted head units based on Linux, WinCE, and Android, giving cars free real-time navigation, 360-degree surround-view cameras, and driver-assistance features.
Once cars connected to the internet, everything changed. Online streaming, navigation, voice control, maintenance booking, remote diagnosis, and more were added to the infotainment system. The central screen kept growing in size and function, with some manufacturers installing displays larger than tablets in the cockpit; some even put "full-screen" displays in front of the passenger and rear seats.
The concept of the "third screen" grew more prominent: automakers hope the car will become the third major smart terminal in people's lives, after the computer and the mobile phone. Winning users' attention with a feature-rich infotainment system, and expanding new business models on top of it, has become the direction car companies are pursuing.
Now, the old notion of the "car infotainment system" is gradually giving way to the "smart cabin"; NIO has even coined the term "second living room". Car systems are becoming smarter, and car companies are also investing in interior materials, audio systems, lighting, and features such as AR glasses for watching a giant virtual screen in the car. The Ideal L9 even ships with a rear refrigerator, turning the car into a mobile home.
Yet whether in the infotainment system or the smart cabin, voice interaction has long lagged behind, even though, for driving safety, voice control is essential.
Over the past decade, almost every car company and a crowd of AI start-ups have invested heavily in natural language processing, hoping to optimize in-car voice interaction. Many infotainment systems can respond to simple preset commands, such as adjusting the temperature or checking the weather, and upgrades have focused on expanding natural-language coverage: when the user says "it's a bit hot", the system turns on the air conditioning or lowers the temperature.
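To make the limitation concrete, the pre-LLM approach amounts to a lookup from a handful of fixed fuzzy phrases to cabin commands. The sketch below is purely illustrative; the phrases and the set_cabin_temperature helper are hypothetical stand-ins for a real climate-control API:

```python
# Illustrative sketch only: maps a few fixed fuzzy phrases to cabin commands,
# the way pre-LLM voice systems expanded "natural language" support.
# set_cabin_temperature is a hypothetical climate-control hook.

def set_cabin_temperature(delta_celsius: float) -> None:
    print(f"Adjusting cabin temperature by {delta_celsius:+.1f} C")

INTENT_MAP = {
    "it's a bit hot": lambda: set_cabin_temperature(-2.0),
    "it's a bit cold": lambda: set_cabin_temperature(+2.0),
}

def handle_utterance(utterance: str) -> None:
    action = INTENT_MAP.get(utterance.strip().lower())
    if action:
        action()
    else:
        # Anything outside the preset phrases fails: the bottleneck
        # the article describes.
        print("Sorry, I didn't understand that.")

handle_utterance("It's a bit hot")
```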
But for the system to understand more "human language", such as planning a route spoken in dialect or finding a restaurant, car owners were often better served by the map and restaurant-review apps on their own phones. More sophisticated voice-based human-vehicle interaction thus remained bottlenecked, until ChatGPT appeared.
Large natural-language models (ChatGPT, Wenxin Yiyan, Tongyi Qianwen, and others) opened directly to consumer users have given smart-cabin developers a glimmer of hope. With powerful comprehension and logical reasoning, the infotainment system has the potential to become a true driving assistant, with commercial possibilities hidden inside.
For example, an owner can tell the voice assistant: "Help me find a hot pot restaurant near my destination with a group-buying discount and a rating above 4.5. Five people are dining, so book a table and check for convenient parking." In the past, the infotainment system could never have digested that much information at once; for ChatGPT, this is basic functionality. As long as the real-time data sources are sufficient, the possibilities for meeting such demands are nearly endless.
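To act on a request like that, the assistant must first decompose it into structured constraints it can hand to map, review, and booking services. A hedged sketch of that intermediate representation follows; the dataclass fields are illustrative, and the stubbed parser stands in for a real LLM call that would return this structure (for instance via JSON or function-calling output):

```python
# Sketch of the structured query an LLM would need to extract from the
# hot pot request above. Field names and the downstream flow are
# hypothetical; a real system would fill them from the model's output.
from dataclasses import dataclass

@dataclass
class RestaurantQuery:
    cuisine: str
    near: str
    min_rating: float
    needs_group_deal: bool
    party_size: int
    needs_parking: bool

def parse_request(utterance: str) -> RestaurantQuery:
    # Stub: in practice the LLM parses the utterance and returns this.
    return RestaurantQuery(
        cuisine="hot pot",
        near="destination",
        min_rating=4.5,
        needs_group_deal=True,
        party_size=5,
        needs_parking=True,
    )

query = parse_request(
    "Find a hot pot restaurant near my destination with a group-buying "
    "discount and a rating over 4.5, book a table for five, and check parking."
)
print(query)
```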
The addition of GPT is not just about smoother conversation: it gives the infotainment system a "brain" that can answer questions, understand demands, and generate responses. How high its IQ is, and how fast it responds, depends on the car maker's ability to train the large model and its willingness to spend on more advanced hardware (chips).
How can AI make the neural networks of autonomous driving more human-like?
The rich in-car experience has transformed the automobile from a dull and cold means of transportation to a comfortable living space, full of warmth and emotion.
Moreover, AI's transformation of the automobile is not limited to putting GPT in the cabin; it also matters greatly for autonomous driving technology.
Traditional autonomous driving R&D collects massive driving data and accumulates test mileage to cover as many driving scenarios as possible, so that the car has a preset response to sudden situations. But sudden situations are often unpredictably complex: if the system has no preplanned response for a particular edge case, driver safety is gravely threatened.
This is why current assisted driving systems require drivers to hold the steering wheel to respond to sudden situations. However, the learning ability of AI may change this situation.
Recently, a research team from Tsinghua University proposed a "trustworthy continuous evolution" technology for autonomous driving. It dynamically evaluates the AI's reliability during learning and training, so that when an autonomous car encounters an unfamiliar new scene, it can start from basic active avoidance and keep improving its driving ability, achieving better performance while ensuring safety.
In simple terms, with AI the car's autonomous driving function can actively learn and familiarize itself with new situations, evolving continuously; as mileage and data accumulate, performance keeps improving.
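The Tsinghua team's actual method is not detailed here, but the gating idea can be sketched: estimate the policy's confidence in the current scene, fall back to conservative avoidance when confidence is low, and queue the unfamiliar scene for training. Everything in the sketch below (the threshold, the function names, the random stand-in for an uncertainty estimate) is illustrative, not the team's design:

```python
# Illustrative sketch of reliability-gated driving, inspired by (not taken
# from) the "trustworthy continuous evolution" idea: use the learned policy
# only when its estimated reliability is high, otherwise fall back to basic
# avoidance and save the scene for later training.
import random

CONFIDENCE_THRESHOLD = 0.9  # hypothetical gate
replay_buffer = []          # unfamiliar scenes queued for offline training

def policy_confidence(scene: dict) -> float:
    # Stand-in for a real uncertainty estimate (e.g., ensemble disagreement).
    return random.random()

def drive(scene: dict) -> str:
    if policy_confidence(scene) >= CONFIDENCE_THRESHOLD:
        return "learned_policy_action"
    replay_buffer.append(scene)      # learn from the unfamiliar scene later
    return "conservative_avoidance"  # safe fallback in the meantime

for step in range(3):
    print(drive({"step": step}))
print(f"{len(replay_buffer)} unfamiliar scenes queued for training")
```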
Ideal Auto is applying large AI models to autonomous driving. On June 17, Ideal announced the City NOA internal test and said it will open the commuting NOA function to users in the second half of the year. Unlike conventional solutions, Ideal uses a BEV (bird's-eye view) model to perceive and understand the road structure around the car in real time, letting it better imitate the driving habits of human drivers.
Most earlier assisted-driving systems relied on high-precision maps, effectively spoon-feeding road conditions to the autonomous driving system in real time for decision-making. But on complex city roads there will always be areas that high-precision maps cannot cover or update in time, which is the major drawback of this approach. With the BEV model, the AI perceives real-time road conditions on its own and makes independent driving decisions.
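Ideal has not published its network, but the core geometric idea behind camera-based BEV can be shown with classic inverse perspective mapping: project each ground-plane cell of a bird's-eye grid into the camera image and sample features there. Below is a minimal numpy sketch under simplifying assumptions (flat ground, a single camera, toy calibration values); production systems such as Ideal's or Tesla's learn this mapping with transformer networks instead:

```python
# Minimal sketch of the geometry behind camera-to-BEV projection
# (inverse perspective mapping). Flat-ground, single-camera, toy values;
# shown only to illustrate the core idea behind BEV perception.
import numpy as np

H, W = 48, 64                       # image feature map size
image_feats = np.random.rand(H, W)  # stand-in for CNN features

K = np.array([[50.0, 0.0, W / 2],   # toy camera intrinsics
              [0.0, 50.0, H / 2],
              [0.0, 0.0, 1.0]])
cam_height = 1.5                    # camera height above the ground (m)

def bev_grid(x_range=(2, 30), y_range=(-10, 10), cell=0.5):
    """Sample image features at each ground-plane BEV cell (z = 0)."""
    xs = np.arange(*x_range, cell)  # forward distance
    ys = np.arange(*y_range, cell)  # lateral offset
    bev = np.zeros((len(xs), len(ys)))
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            # Ground point in camera coords: right = x, down = y, forward = z.
            p_cam = np.array([y, cam_height, x])
            uvw = K @ p_cam
            u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
            if 0 <= u < W and 0 <= v < H:  # visible in this camera
                bev[i, j] = image_feats[int(v), int(u)]
    return bev

print(bev_grid().shape)  # (56, 40): a top-down feature grid
```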
Of course, BEV has its own weaknesses. At wide, busy intersections, for example, the sensors' field of view is easily blocked, and the car loses part of its perceived surroundings. To compensate, Ideal says it combines a Neural Prior Net (NPN) with an end-to-end traffic-light intent network: the former provides a prior reference each time the car passes a road that the autonomous driving fleet has already driven, and the latter learns how large numbers of human drivers react to signal changes at intersections, helping the system understand traffic lights.
According to feedback from real-world testing, Ideal City NOA cannot yet achieve fully autonomous driving: turns are not always timely, overtaking is clumsy, and when facing certain unusual obstacles the algorithm cannot decide and the driver must take over.
Still, compared with traditional training methods, the introduction of large models is the biggest change, giving autonomous driving systems much stronger learning ability; their capabilities will keep improving. A case in point is Ideal Auto's commuting NOA: before activating it, the owner sets the commuting route, and after roughly one to three weeks of automated training during daily commutes, the AI can serve as a "chauffeur" on that route.
This process reflects how autonomous cars operate with the support of large AI models: first learn and grow familiar with the road, then assist with the driving. The "brain circuit" is more like a human's.
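Ideal has not described the internal mechanics of the commute training, but the gating behavior it describes can be sketched as a per-route readiness counter: each successful shadowed commute raises confidence until the feature unlocks. The class, names, and threshold below are purely illustrative:

```python
# Purely illustrative sketch of commute-route familiarization gating:
# the commuting NOA feature unlocks only after enough successful training
# passes over the user's fixed route. The threshold is invented.
REQUIRED_PASSES = 10  # hypothetical: roughly 1-3 weeks of daily commutes

class CommuteNOA:
    def __init__(self, route_id: str):
        self.route_id = route_id
        self.successful_passes = 0

    def record_training_pass(self, success: bool) -> None:
        """Called after each shadow-mode drive over the commuting route."""
        if success:
            self.successful_passes += 1

    @property
    def unlocked(self) -> bool:
        return self.successful_passes >= REQUIRED_PASSES

noa = CommuteNOA(route_id="home-to-office")
for day in range(12):
    noa.record_training_pass(success=True)
print(f"Commuting NOA unlocked: {noa.unlocked}")
```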
Ideal was not the pioneer in using large AI models for autonomous driving; Tesla was. As early as 2021, Tesla introduced a BEV perception solution based on the Transformer architecture, and companies such as Huawei and Baidu later began building on "BEV + Transformer" as well. Today Tesla, Xpeng, WM Motor, and others are all shipping and continuously optimizing similar "City NOA" functions.
The continued evolution of large models may well help automakers find a breakthrough in autonomous driving technology, and the first step is breaking free of high-precision maps. For now, autonomous driving remains at the "assisted driving" stage; in the future, you may well entrust your car to AI.
Would you dare to entrust your car to AI?