In June 2022, Cloudera announced the general availability of Apache Iceberg in the Cloudera Data Platform (CDP). Iceberg is a 100% open table format, developed through the Apache Software Foundation, which helps users avoid vendor lock-in and implement an open lakehouse. The general availability covers Iceberg running within some of the key data services in CDP, […]
Z-order is an ordering for multi-dimensional data, e.g. rows in a database table. Once data is in Z-order it is possible to efficiently search against more columns. This article reveals how Z-ordering works and how one can use it with Apache Impala. In a previous blog post, we demonstrated the power of Parquet page indexes, […]
Fine-grained access control (FGAC) with Spark: Apache Spark, with its rich data APIs, has been the processing engine of choice in a wide range of applications, from data engineering to machine learning, but its security integration has been a pain point. Many enterprise customers need finer granularity of control, in particular at the column […]
The telco and financial services industries within APAC have been at the forefront of adopting data and analytics to deliver differentiated products and services. With mature enterprise data strategies helping organizations achieve 5.97% higher profit growth, here’s how a modern data architecture and hybrid approach to data can propel them even higher.
Co-author: Mike Godwin, Head of Marketing, Rill Data. Cloudera has partnered with Rill Data, an expert in metrics at any scale, as Cloudera’s preferred ISV partner to provide technical expertise and support services for Apache Druid customers. We want Cloudera customers that rely on Apache Druid to know that their clusters are secure and supported […]
This month, Cloudera Cares is excited to spotlight Burt Wagner, senior solutions engineer from Alexandria, Virginia. Burt, who joined Cloudera earlier this year, volunteers regularly with the Boy Scouts of America. He started Scouting as an eight-year-old; it has always been an integral part of his life and something he now enjoys sharing with […]
In part 1 of this blog we discussed how Cloudera DataFlow for the Public Cloud (CDF-PC), the universal data distribution service powered by Apache NiFi, can make it easy to acquire data from wherever it originates and move it efficiently to make it available to other applications in a streaming fashion. In this blog we […]