Data Warehouse Archives - Cloudera Blog

May 30, 2024 | Business

Bringing Financial Services Business Use Cases to Life: Leveraging Data Analytics, ML/AI, and Gen AI

The financial services industry is undergoing a significant transformation, driven by the need for data-driven insights, digital transformation, and compliance with evolving regulations. In this context, Cloudera and TAI Solutions have partnered to help financial services customers accelerate their data-driven transformation, improve customer centricity, ensure compliance with regulations, enhance risk management, and drive innovation. Cloudera […]

by Joe Rodriguez 3 min read

February 12, 2024 | Technical

DNS Zone Setup Best Practices on Azure

A deep dive into using DNS with Cloudera Data Services on Azure

by Dongkai Yu 7 min read

CDP Public Cloud Cloudera Data Platform (CDP) Data Engineering Data Warehouse DataFlow Machine Learning

February 8, 2024 | Technical

Accelerating Queries on Iceberg Tables with Materialized Views

Overview: This blog post describes support for materialized views for the Iceberg table format in Cloudera Data Warehouse. Apache Iceberg is a high-performance open table format for petabyte-scale analytic datasets. It has been designed and developed as an open community standard to ensure compatibility across languages and implementations. It brings the reliability and simplicity of […]

by Aman Sinha, Krisztian Kasa 8 min read

CDP Public Cloud Cloudera Data Platform (CDP) Data Warehouse Performance

January 19, 2024 | Technical

Setting up and Getting Started with Cloudera’s New SQL AI Assistant

As described in our recent blog post, an SQL AI Assistant has been integrated into Hue with the capability to leverage the power of large language models (LLMs) for a number of SQL tasks. It can help you to create, edit, optimize, fix, and succinctly summarize queries using natural language. This is a real game-changer […]

by Björn Alm, Mohammed Tabraiz, Sreenath Somarajapuram 9 min read

CDP Public Cloud Cloudera Data Platform (CDP) Data Hub Data Warehouse Customer Analytics Data Science

December 21, 2023 | Business

Introducing the SQL AI Assistant: Create, Edit, Explain, Optimize, and Fix Any Query

Increase your SQL development productivity with the new SQL AI Assistant

by David Dichmann 7 min read

Data Hub Data Warehouse

November 7, 2023 | Technical

Apache Ozone – A Multi-Protocol Aware Storage System

Bucket Layouts in Apache Ozone

by Saketa Chandra Chalamchala, Ethan Rose 5 min read

CDP Private Cloud Cloudera Data Platform (CDP) Data Engineering Data Warehouse Machine Learning Data Ingestion

October 4, 2023 | Technical

Don’t Blink: You’ll Miss Something Amazing!

Fast-moving data and real-time analysis present us with some amazing opportunities. Don’t blink, or you’ll miss it! Every organization has some data that happens in real time, whether it is understanding what our users are doing on our websites or watching our systems and equipment as they perform mission-critical tasks for us. This […]

by David Dichmann 4 min read

Data Hub Data Warehouse

September 15, 2023 | Business

Telecommunications Data Monetization Strategies in 5G and beyond with Cloudera and AWS

The world is awash with data, nowhere more so than in the telecommunications (telco) industry. With some Cloudera customers ingesting multiple petabytes of data every single day (that’s multiple thousands of terabytes!), there is the potential to understand, in great detail, how people, businesses, cities and ecosystems function. This information is essential for the management of […]

by Anthony Behan, Jon Penrose 5 min read

CDP Public Cloud Data Hub Data Warehouse Machine Learning Operational DB SDX Technologies Telecommunications Customer Analytics

August 7, 2023 | Technical

HDFS Snapshot Best Practices

Introduction: The snapshots feature of the Apache Hadoop Distributed Filesystem (HDFS) enables you to capture point-in-time copies of the file system and protect your important data against corruption and user or application errors. This feature is available in all versions of Cloudera Data Platform (CDP), Cloudera Distribution for Hadoop (CDH) and Hortonworks Data Platform (HDP). Regardless […]

by Tsz Sze 7 min read

CDP Private Cloud CDP Public Cloud Data Warehouse Data Ingestion

July 13, 2023 | Technical

12 Times Faster Query Planning With Iceberg Manifest Caching in Impala

Iceberg is an emerging open table format designed for large analytic workloads. The Apache Iceberg project continues to develop an implementation of the Iceberg specification in the form of a Java library. Several compute engines, such as Impala, Hive, Spark, and Trino, support querying data in the Iceberg table format by adopting this Java library provided by the Apache […]

by Riza Suminto 7 min read

Apache Iceberg Apache Impala CDP Public Cloud Data Warehouse Performance
