Introduction For more than a decade now, the Hive table format has been a ubiquitous presence in the big data ecosystem, managing petabytes of data with remarkable efficiency and scale. But as the data volumes, data variety, and data usage grows, users face many challenges when using Hive tables because of its antiquated directory-based table […]
Iceberg is an emerging open-table format designed for large analytic workloads. The Apache Iceberg project continues developing an implementation of Iceberg specification in the form of Java Library. Several compute engines such as Impala, Hive, Spark, and Trino have supported querying data in Iceberg table format by adopting this Java Library provided by the Apache […]
Since we announced the general availability of Apache Iceberg in Cloudera Data Platform (CDP), we are excited to see customers testing their analytic workloads on Iceberg. We are also receiving several requests to share more details on how key data services in CDP, such as Cloudera Data Warehousing (CDW), Cloudera Data Engineering (CDE), Cloudera Machine […]
Cloudera’s open source licensing policies have evolved with the changing dynamics in open source innovation. For more information on Cloudera’s current policy, please contact OSSQuestions@cloudera.com. We are now well into 2022 and the megatrends that drove the last decade in data—The Apache Software Foundation as a primary innovation vehicle for big data, the arrival of […]