In June 2022, Cloudera announced the general availability of Apache Iceberg in the Cloudera Data Platform (CDP). Iceberg is a 100% open-table format, developed through the Apache Software Foundation, which helps users avoid vendor lock-in and implement an open lakehouse. The general availability covers Iceberg running within some of the key data services in CDP, […]
Cloudera customers run some of the biggest data lakes on earth. These lakes power mission critical large scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. In recent years, the term “data lakehouse” was coined to describe this architectural pattern of tabular analytics over data in the data lake. […]
Data holds incredible untapped potential for Australian organisations across industries, regardless of individual business goals, and all organisations are at different points in their data transformation journey with some achieving success faster than others. To be successful, the use of data insights must become a central lifeforce throughout an organisation and not just reside within […]
Please join us on March 24 for Future of Data meetup where we do a deep dive into Iceberg with CDP What is Apache Iceberg? Apache Iceberg is a high-performance, open table format, born-in-the cloud that scales to petabytes independent of the underlying storage layer and the access engine layer. By being a truly open […]
Apache Impala is used today by over 1,000 customers to power their analytics in on premise as well as cloud-based deployments. Large user communities of analysts and developers benefit from Impala’s fast query execution, helping them get their work done more effectively. For these users performance and concurrency are always top of mind. An important […]
Over the past decade, the successful deployment of large scale data platforms at our customers has acted as a big data flywheel driving demand to bring in even more data, apply more sophisticated analytics, and on-board many new data practitioners from business analysts to data scientists. This unprecedented level of big data workloads hasn’t come […]
Since the release of Cloudera Data Engineering (CDE) more than a year ago, our number one goal was operationalizing Spark pipelines at scale with first class tooling designed to streamline automation and observability. In working with thousands of customers deploying Spark applications, we saw significant challenges with managing Spark as well as automating, delivering, […]