Cloudera has been working on Apache Ozone, an open-source project to develop a highly scalable, highly available, strongly consistent distributed object store. Ozone is able to scale to billions of objects and hundreds of petabytes of data. It enables cloud-native applications to store and process massive amounts of data in a hybrid multi-cloud environment and on […]
In this blog post we describe how to use the REST API of Apache Solr in CDP Public Cloud directly or through Apache Knox Gateway. We show indicative performance measurements, demonstrating the performance improvement one can achieve by reusing cookies set by the Knox Gateway.
Information technology has been at the heart of governments around the world, enabling them to deliver vital citizen services, such as healthcare, transportation, employment, and national security. All of these functions rest on technology and share a valuable commodity: data. Data is produced and consumed in ever-increasing amounts and therefore must be protected. After all, […]
Inevitably, in any production deployment, the number of Kafka nodes required to maintain the cluster changes over time. Balancing performance and cloud costs requires that administrators scale up or down accordingly.
Cloudera DataFlow for the Public Cloud (CDF-PC) is a cloud-native service for Apache NiFi within the Cloudera Data Platform (CDP). CDF-PC enables organizations to take control of their data flows and eliminate ingestion silos by allowing developers to connect to any data source anywhere with any structure, process it, and deliver to any destination using […]
The promise of a modern data lakehouse architecture
Imagine having self-service access to all business data, anywhere it may be, and being able to explore it all at once. Imagine quickly answering burning business questions nearly instantly, without waiting for data to be found, shared, and ingested. Imagine independently discovering rich new business insights from […]
Apache HBase has long been the database of choice for business-critical applications across industries. This is primarily because HBase provides scale, performance, and fault tolerance that few other databases can come close to. Think petabytes of data spread across trillions of rows, ready for consumption in real time. While application developers and database admins are well […]
Data is the fuel that drives government, enables transparency, and powers citizen services. But while state and local governments seek to improve policies, decision making, and the services constituents rely upon, data silos create accessibility and sharing challenges that hinder public sector agencies from transforming their data into a strategic asset and leveraging it for […]