Please join us on March 24 for Future of Data meetup where we do a deep dive into Iceberg with CDP What is Apache Iceberg? Apache Iceberg is a high-performance, open table format, born-in-the cloud that scales to petabytes independent of the underlying storage layer and the access engine layer. By being a truly open […]
Data Science tools, algorithms, and practices are rapidly evolving to solve business problems on an unprecedented scale. This makes data science one of the most exciting fields to be in. As exciting as it is, practitioners face their fair share of challenges. There are well-known barriers that slow down predictive modeling or application development. Finding […]
The world cannot ignore the horrific invasion of Ukraine and the plight of the Ukrainian people, who are facing death and devastation in the defense of their country. Our Cloudera team members and their families in Ukraine have been impacted in ways we cannot imagine, and their safety is our top priority. Cloudera – and […]
With the launch of CDP Public Cloud 7.2.14, Cloudera Streams Messaging for Data Hub deployments has gotten some powerful new features! In this release, the Streams Messaging templates in Data Hub will come with Apache Kafka 2.8 and Cruise Control 2.5 providing new core features and fixes. KConnect has been added and gains additional capabilities […]
Note: this publication has been updated from original blog post published on June 3, 2021 Cloudera Support’s cluster validations proactively identify known problem signatures contained in customers’ diagnostic data with the goal of increasing cluster health, performance, and overall stability. Cluster validations are included in a customer’s enterprise subscription at no additional cost. All customers […]
Apache Impala is used today by over 1,000 customers to power their analytics in on premise as well as cloud-based deployments. Large user communities of analysts and developers benefit from Impala’s fast query execution, helping them get their work done more effectively. For these users performance and concurrency are always top of mind. An important […]
Over the past decade, the successful deployment of large scale data platforms at our customers has acted as a big data flywheel driving demand to bring in even more data, apply more sophisticated analytics, and on-board many new data practitioners from business analysts to data scientists. This unprecedented level of big data workloads hasn’t come […]