Data – the Octane Accelerating Intelligent Connected Vehicles

The digital revolution is making a deep impact on the automotive industry, offering practically unlimited possibilities for more efficient, convenient, and safe driving and travel experiences in connected vehicles. This revolution is just beginning to accelerate: according to a recent Allied Market Research study, the global connected car market was valued at $63.03 billion in 2019 and is projected to reach $225.16 billion by 2027, registering a CAGR of 17.1% from 2020 to 2027.

As advanced use cases such as advanced driver assistance systems featuring lane change departure detection, advanced vehicle diagnostics, and predictive maintenance move forward, the existing infrastructure of the connected car is being stressed. Within the vehicle, current electronics and wiring infrastructures were not designed for this level of complex data wrangling. Adding more wires and throwing more compute hardware at the problem is simply not viable given the cost and complexity of today’s connected cars and the additional demands designed into electric cars (like battery management systems and eco-trip planning). Outside the vehicle, fragmented approaches to data management across the machine learning lifecycle limit the ability to deploy new use cases at scale.

Future connected vehicles will rely on a complete data lifecycle approach to implement the enterprise-level advanced analytics and machine learning that enable these advanced use cases and will ultimately lead to fully autonomous driving. For the data lifecycle to be seamless, all of its components must integrate. It starts with data ingestion and selection on a service-oriented gateway that has the speed, bandwidth, security, and connectivity to pass data through to the cloud. It continues with a machine learning platform that has the scalability to ingest and process data from widely ranging sources and to create and update machine learning models. It ends with deploying new or updated machine learning models that exceed data fidelity and model accuracy expectations to vehicles via over-the-air (OTA) updates. A successful next-generation architecture must embody key characteristics including embedded intelligent edge computing, a secure and reliable embedded edge operating system, the ability to provide dynamic over-the-air updates, and an enterprise-level advanced analytics and machine learning platform.
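The three stages described above (selection at the gateway, training in the cloud, deployment over the air) can be sketched as a simple loop. This is only a minimal illustration under stated assumptions: the `Model` class, `lifecycle_step`, and the toy `keep_nonzero` and `toy_train` helpers are hypothetical names invented for this example and do not correspond to any real product API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Model:
    version: int
    accuracy: float  # assumed to be measured on held-out data in the cloud


def lifecycle_step(
    deployed: Optional[Model],
    edge_data: List[float],
    select: Callable[[List[float]], List[float]],
    train: Callable[[List[float], int], Model],
) -> Model:
    """One pass of the loop: select data at the edge, train in the cloud,
    and return whichever model should be running in the vehicle afterwards."""
    uploaded = select(edge_data)  # gateway: ingestion and selection
    next_version = deployed.version + 1 if deployed else 1
    candidate = train(uploaded, next_version)  # cloud: create or update the model
    # OTA: push the candidate only if it beats the currently deployed model.
    if deployed is None or candidate.accuracy > deployed.accuracy:
        return candidate
    return deployed


# Hypothetical stand-ins so the sketch runs end to end.
def keep_nonzero(samples: List[float]) -> List[float]:
    return [s for s in samples if s != 0.0]


def toy_train(samples: List[float], version: int) -> Model:
    # Toy rule: more selected data gives a better model, capped at 0.99.
    return Model(version, min(0.99, 0.5 + 0.1 * len(samples)))


model = lifecycle_step(None, [0.0, 0.3, 0.0], keep_nonzero, toy_train)
model = lifecycle_step(model, [0.2, 0.4, 0.0, 0.1], keep_nonzero, toy_train)
print(model)
```

The point of the sketch is the comparison before deployment: a newly trained model replaces the one in the vehicle only when it measures better, which is one simple way to keep accuracy from regressing across updates.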

The vehicle-to-cloud solution driving advanced use cases

Airbiquity, Cloudera, NXP, Teraki, and Wind River teamed up on The Fusion Project, whose objective is to define and provide an integrated solution from vehicle edge to cloud that addresses the challenges of a fragmented machine learning data management lifecycle. The goal is to define, implement, and offer a data lifecycle platform that enables and optimizes future connected and autonomous vehicle systems by training connected vehicle AI/ML models faster, with higher accuracy, and at lower cost. The state-of-the-art hardware, software, and cloud data analytics platform used for data collection, analysis, and OTA updates showcases continuous training and improvement of advanced use cases and autonomous driving functions for production vehicles.

Phil Magney, founder and president of VSI Labs and former co-founder of the Telematics Research Group, said this about The Fusion Project: “Automakers are constantly challenged with implementing complex technologies such as those required for the next phase of advanced ADAS and autonomous vehicle features. There are many facets to a next-generation data management technology stack that continuously improves and deploys AI machine learning models, so automakers need a vehicle-to-cloud solution like the one created by The Fusion Project that leverages key technologies from across the automotive ecosystem.”

Intelligent vehicle lane change detection was the first of many planned use cases on this platform. It was chosen because it is the first step toward ADAS L2/L3 capability and, ultimately, full L4 autonomous driving. The diagram below summarizes this dynamic, fully integrated machine learning lifecycle, in which the accuracy of the connected vehicle’s ML models is continuously improved.

The Fusion Project’s use case demonstration has shown connected vehicle ML models reaching up to 99+% accuracy with up to 98% data reduction, while delivering 10x faster ML model training. The model accuracy comes from more accurate data collection, labeling, and annotation, while the data reduction was achieved by selecting only relevant data for training and by processing and encoding connected vehicle sensor data.
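To give a rough feel for how selecting only relevant data shrinks what has to be sent for training, the sketch below keeps only the samples near an event in a synthetic signal and reports the fraction dropped. The signal, the threshold, the window size, and the function names are all assumptions made for illustration; this is not the selection or encoding method actually used in The Fusion Project.

```python
from typing import List


def select_relevant(samples: List[float], threshold: float, window: int) -> List[int]:
    """Return indices of samples within `window` steps of any sample whose
    absolute value exceeds `threshold` (a stand-in for an 'event')."""
    keep = set()
    for i, value in enumerate(samples):
        if abs(value) > threshold:
            lo = max(0, i - window)
            hi = min(len(samples), i + window + 1)
            keep.update(range(lo, hi))
    return sorted(keep)


def reduction_ratio(total: int, kept: int) -> float:
    """Fraction of samples dropped, e.g. 0.98 means 98% data reduction."""
    return 1.0 - kept / total


# Synthetic lateral-offset trace: flat except for one brief excursion.
trace = [0.0] * 1000
trace[500] = 1.2  # hypothetical lane-change-like spike

kept = select_relevant(trace, threshold=1.0, window=10)
print(len(kept), reduction_ratio(len(trace), len(kept)))
```

In this toy trace only the 21 samples around the spike survive, so most of the recording never needs to leave the vehicle. Real sensor data would need a far more careful definition of what counts as relevant.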

Cloudera contributed to this connected vehicle machine learning lifecycle solution with its Cloudera Data Platform and the Cloudera Data Flow (CDF) and Cloudera Machine Learning (CML) experiences. Cloudera Data Flow is key to collecting and streaming the connected vehicle’s intelligent edge data into the cloud. It addresses the challenges of data in motion: scaling the processing and analysis of massive data streams, ingesting both structured and unstructured data from multiple sources, and harnessing the value of such high-volume, high-speed data.

CML creates, updates, and manages the connected vehicle’s machine learning models. It facilitates secure and fast ML workflows, has the power to scale AI use cases everywhere, provides self-service compute, IDEs, libraries, and frameworks, and can deliver models across hybrid clouds. Continuous optimization delivers model and prediction accuracy monitoring, ground-truthing, model governance, lineage tracking, and model cataloging.

In summary, this is an exciting time. Michael Ger, Managing Director of Manufacturing & Automotive at Cloudera, summed it up best:

“Imparting intelligence into connected cars is complex – involving hardware, software, and deep domain expertise. Cloudera is proud to provide the underlying data management fabric to the solution – everything from reliably moving connected vehicle data to the Cloud, to providing large scale data storage, processing, analytics and machine learning – the foundations of real-time insights and in-vehicle decision making.” 

Schedule a demo of this technology at The Fusion Project or learn more about Cloudera’s Connected Manufacturing and Vehicle solutions. In addition, join us for Industry 4.0 - Made Real, where Cloudera, Accenture, Dell, Intel, and Microsoft will hold frank conversations on solving Industry 4.0 challenges. You will hear from industry-leading experts in a panel-style discussion on topics that challenge Industry 4.0 scalability, ROI, and success. This author is passionate about Industry 4.0, connected manufacturing, and connected vehicles; see more of his perspective at
