Addressing the Elephant in the Room – Welcome to Today’s Cloudera

Hadoop. The first time that I really became familiar with this term was at Hadoop World in New York City some ten or so years ago. There were thousands of attendees at the event – lining up for book signings and meetings with recruiters to fill the endless job openings for developers experienced with MapReduce and managing Big Data. 

This was the gold rush of the 21st century, except the gold was data. Two companies were at the center of it all, handing out the proverbial pickaxes: Cloudera and Hortonworks. 

After countless open-source innovations ushered in the Big Data era, including the first commercial distribution of Apache Hadoop (whose storage layer is HDFS, the Hadoop Distributed File System), the two companies joined forces. Together, they gave birth to an entire ecosystem of technology and tech companies.

You could argue that the Big Data and analytics movement would not have happened without Cloudera. And all of those massive volumes of data have become today’s data lakes – more than 25 exabytes managed by Cloudera.

But let’s make one thing clear – we are no longer that Hadoop company.

Welcome to Today’s Cloudera

Since the Big Data era, Cloudera has made massive investments with the north star of delivering customer-focused innovation wherever our customers run their business-critical data and analytics. This includes running analytics at the edge, supporting multi-cloud environments, treating Apache Iceberg as a first-class citizen, and introducing many more innovations like data observability.

That investment and support have resulted in the first true hybrid platform for data, analytics, and AI, backed by a seasoned and proven leadership team, with a go-to-market strategy focused on ensuring our customers’ success in the future of Enterprise AI.

The Future of Enterprise AI, Delivered Today

If the Big Data era was this century’s gold rush, then AI is the next moon shot. Hyperbole doesn’t really apply when you speak of the potential impact of AI on every business and person on earth (and beyond). But what is essential to putting AI into practice to improve productivity? Again, it’s all about the data, and more specifically, trusted data, so that you can trust Enterprise AI.

Only Cloudera has the ability to help organizations overcome the three barriers to trust in Enterprise AI:

  • Readiness – Can you trust the safety of your proprietary data in public AI models? Cloudera’s true hybrid approach ensures you can leverage any deployment, from virtual private cloud to on-premises data centers, to maximize the use of AI. 
  • Reliability – Can you trust that your data quality will yield useful AI results? With Cloudera’s modern data architectures, you can ensure your data is of high quality, well-governed, and managed as a single data estate.
  • Responsibility – Can you trust your AI models will give meaningful insight? Cloudera supports both open and closed models for Enterprise AI across all form factors, giving you the choice, flexibility, and ability to cross-compare models and ensure useful outcomes that you can trust.

With last week’s acquisition of Verta’s operational AI platform, we are deepening our technology and talent to accelerate AI innovation and, more specifically, simplify the process of bolstering customers’ private datasets to build retrieval-augmented generation (RAG) and fine-tuning applications. As a result, developers – regardless of their expertise in machine learning – will be able to develop and optimize business-ready large language models (LLMs). These bold acquisitions, a continual release of innovations, and key partnerships from the ecosystem, including NVIDIA, will enable all companies to prosper in the Enterprise AI era.

The Enterprise Runs on Cloudera

Innovative technology aside, the best evidence of how a vendor has evolved to meet the business-critical use cases of its customers is their success stories.

Cloudera plays a central role not only at work but in all of our daily lives – from the money you save and spend, to the energy and connectivity in your home, to the car you are driving (and your insurance rates), to the phone and network that you are using, to the life-saving drugs and healthcare that keep you and your loved ones healthy.

A recent customer story – OCBC Bank has accelerated its data strategy with Cloudera – illustrates the power of Cloudera for machine learning use cases, particularly in the area of GenAI with business impact:

  • OCBC built Next Best Conversation, a centralized platform that uses machine learning to analyze real-time contextual data from customer conversations related to sales, service, and more. The bank increased its revenue by more than $100M annually by using the data to identify the most relevant information for each customer and curate personal experiences across communication channels.
  • OCBC also developed a credit card fraud detection solution that reduced the volume of transactions reviewed by anti-money laundering compliance analysts and increased the accuracy rate of identifying suspicious transactions. The bank also developed smarter processes on the platform by introducing chatbots to take over 10% of customer interactions on its website.

But What Happened to Hadoop?

Many of our customers store and manage their data – much of it unstructured – in HDFS, particularly in on-prem environments. And, with the growing popularity of object storage, we support a variety of S3 object stores from our partners for customers who want cloud-native architectures delivered on public and private clouds. 

That open approach is key to enabling our customers to analyze data wherever it resides. Instead of moving the data each time to the compute that you want to use, you just keep all the data in its current place and bring the compute to the data. That is the key to our open data lakehouse architecture.
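To make the "bring the compute to the data" idea concrete, here is a minimal Python sketch. It is an illustration only, not Cloudera's implementation: the site names, the sample values, and the helper functions (local_summary, run_in_place, combine) are all hypothetical. Each site computes a small summary over its own data, and only those summaries travel, never the raw rows.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for data that lives in different places,
# for example an on-premises cluster and two cloud object stores.
SITES: Dict[str, List[float]] = {
    "on_prem": [120.0, 80.0, 45.5],
    "cloud_a": [10.0, 20.0],
    "cloud_b": [300.0],
}

Summary = Dict[str, float]


def local_summary(rows: List[float]) -> Summary:
    """Runs where the data lives and returns only a small aggregate."""
    return {"count": float(len(rows)), "total": sum(rows)}


def run_in_place(
    sites: Dict[str, List[float]],
    task: Callable[[List[float]], Summary],
) -> Dict[str, Summary]:
    """Send the task to each site and collect the per-site results."""
    return {name: task(rows) for name, rows in sites.items()}


def combine(partials: Dict[str, Summary]) -> Summary:
    """Merge the small per-site summaries into one global answer."""
    count = sum(p["count"] for p in partials.values())
    total = sum(p["total"] for p in partials.values())
    mean = total / count if count else 0.0
    return {"count": count, "total": total, "mean": mean}


if __name__ == "__main__":
    print(combine(run_in_place(SITES, local_summary)))
```

In this toy version the "sites" are just in-memory lists, so nothing is really distributed. The point is the shape of the exchange: the function goes to the data, and only a handful of numbers come back.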

Also, we have seen significant adoption of Apache Ozone, a scalable, redundant, and distributed object store optimized for big data workloads running on-premises. In fact, Cloudera customers have just exceeded 1 exabyte of data stored in Ozone. In addition, customers can use the Ozone file system with key Apache technologies, including Apache Hive, Apache Spark, and Apache Iceberg, as well as any S3-compatible workload.

Those are just a few examples of how Cloudera constantly evolves with customer-led innovation to prepare everyone for a truly open future of data, analytics, and AI. That’s today’s Cloudera.

To learn more about groundbreaking innovations and customer stories, join us at Cloudera EVOLVE, the industry’s premier data and AI conference. We hope to see you there. 

Jeff Healey
Senior Vice President, Product Marketing